The invention relates to the field of virology.

Recently, a respiratory illness (atypical pneumonia) was diagnosed in an
8-month-old patient that could not be attributed to SARS (Severe Acute Respiratory
Syndrome) virus or any other known viral infection. The patient tested negative for
influenza, parainfluenza, mumps and RSV and yet the disease was identified to be
caused by a virus which closely resembled the SARS virus.

To be able to trace its origin, monitor its epidemiology and prevent possible
spreading of the disease, it is of great importance to be able to recognise viral causes of
pneumonia at an early stage. Especially if severe diseases are found to be caused by
viruses, it is necessary to determine the identity of the virus as soon as possible, in order to
develop diagnostic tools and possibly therapies. The SARS epidemic has shown that it is
paramount for prevention of spread of the disease to be able to get an early diagnosis in
order to take effective isolation measures and initiate quarantine precautions in time.
Only then can world-wide contamination be prevented.

Furthermore, identification of the viral cause of the disease enables
development of vaccines, which can be used prophylactically to protect people who are at
risk of being infected. Finally, knowledge of the viral cause enables the development of
therapeutic measures.

Thus, there is a great need for diagnostic tools and therapies for viral
pneumonias in general, and in particular for a novel disease-causing infectious agent,
especially when this agent appears to be a virus.

The invention provides the nucleotide sequence of an isolated essentially
mammalian positive-sense single stranded RNA virus
which is the causative factor for the new disease, hereinafter referred to as EMCR-CoV,
the disease being referred to as EMCR-CoV-caused pneumonia. From a phylogenetic
analysis of the Matrix and Nucleocapsid gene sequences of the virus (Fig. 2a and 2b) it
appears that the virus is a distinct member of the group formed by PEDV (porcine
epidemic diarrhea virus), HCoV-229E (human coronavirus 229E), PRCoV (porcine
respiratory coronavirus), TGEV (transmissible gastroenteritis virus), CaCoV (canine
coronavirus) and FeCoV (feline coronavirus). Based on amino acid identity matrices,
human coronavirus 229E seems to be the closest relative (for all ORFs with the
exception of Matrix, which appears to be slightly more closely related to PEDV; see
Figure 3).

Although phylogenetic analyses provide a convenient method of identifying a 
virus, several other possibly more straightforward albeit somewhat more coarse 
methods for identifying said virus or viral proteins or nucleic acids from said virus are 
herein also provided. As a rule of thumb an EMCR-Coronavirus can be identified by the 
percentages of homology of the virus, proteins or nucleic acids to be identified in
comparison with viral proteins or nucleic acids identified herein by sequence. It is 
generally known that virus species, especially RNA virus species, often constitute a 
quasi species wherein a cluster of said viruses displays heterogeneity among its 
members. Thus it is expected that each isolate may have a somewhat different 
percentage relationship with the sequences of the isolate as provided herein.

When one wishes to compare a virus isolate with the sequences as listed in figure
1, the invention provides an isolated essentially mammalian positive-sense single
stranded RNA virus (EMCR-CoV) belonging to the Coronaviruses and identifiable as
phylogenetically corresponding thereto by determining a nucleic acid sequence of said
virus and determining that said nucleic acid sequence has a percentage nucleic acid
identity to the sequences as listed higher than the percentages identified herein for the
nucleic acids as identified herein below in comparison with PEDV, 229E, PRCoV, TGEV,
CaCoV and FeCoV. Likewise, the invention provides an isolated essentially mammalian
positive-sense single stranded RNA virus (EMCR-CoV) belonging to the Coronaviruses
and identifiable as phylogenetically corresponding thereto by determining an amino acid
sequence of said virus and determining that said amino acid sequence has a percentage
amino acid homology to the sequences as listed which is essentially higher than the
percentages provided herein in comparison with PEDV, 229E, PRCoV, TGEV, CaCoV and
FeCoV.

With the provision of the sequence information of this EMCR-Coronavirus
(EMCR-CoV), the invention provides diagnostic means and methods, prophylactic means
and methods and therapeutic means and methods to be employed in the diagnosis,
prevention and/or treatment of disease, in particular of respiratory disease (atypical
pneumonia), in particular of mammals, more in particular of humans, associated with
infection by this virus. In virology, it is most advisable that diagnosis, prophylaxis and/or
treatment of a specific viral infection is performed with reagents that are most specific
for said specific virus causing said infection. In this case this means that it is preferred
that said diagnosis, prophylaxis and/or treatment of an EMCR-CoV virus infection is
performed with reagents that are most specific for EMCR-CoV virus. This by no means
excludes, however, the possibility that less specific, but sufficiently cross-reactive
reagents are used instead, for example because they are more easily available and
sufficiently address the task at hand.

The invention for example provides a method for virologically diagnosing an
EMCR-CoV infection of an animal, in particular of a mammal, more in particular of a
human being, comprising determining in a sample of said animal the presence of a viral
isolate or component thereof by reacting said sample with an EMCR-CoV specific nucleic
acid or antibody according to the invention, and a method for serologically diagnosing an
EMCR-CoV infection of a mammal comprising determining in a sample of said mammal
the presence of an antibody specifically directed against an EMCR-CoV virus or
component thereof by reacting said sample with an EMCR-CoV virus-specific
proteinaceous molecule or fragment thereof or an antigen according to the invention.

The invention also provides a diagnostic kit for diagnosing an EMCR-CoV
infection comprising an EMCR-CoV virus, an EMCR-CoV virus-specific nucleic acid,
proteinaceous molecule or fragment thereof, antigen and/or an antibody according to the
invention, and preferably a means for detecting said EMCR-CoV virus, EMCR-CoV
virus-specific nucleic acid, proteinaceous molecule or fragment thereof, antigen and/or
an antibody, said means for example comprising an excitable group such as a
fluorophore or enzymatic detection system used in the art (examples of suitable
diagnostic kit formats comprise IF, ELISA, neutralization assay, RT-PCR assay). To
determine whether an as yet unidentified virus component or synthetic analogue thereof,
such as nucleic acid, proteinaceous molecule or fragment thereof, can be identified as
EMCR-CoV-virus-specific, it suffices to analyse the nucleic acid or amino acid sequence
of said component, for example for a stretch of said nucleic acid or amino acid,
preferably of at least 10, more preferably at least 25, more preferably at least 40
nucleotides or amino acids (respectively), by sequence homology comparison with the
provided EMCR-CoV viral sequences and with known non-EMCR-CoV viral sequences
(human coronavirus 229E is preferably used) using for example phylogenetic analyses as
provided herein. Depending on the degree of relationship with said EMCR-CoV or
non-EMCR-CoV viral sequences, the component or synthetic analogue can be identified.



The invention thus provides the nucleotide sequence of a novel etiological agent,
an isolated essentially mammalian positive-sense single stranded RNA virus (herein 
also called EMCR-CoV virus) belonging to the Coronaviridae family, and EMCR-CoV 
virus-specific components or synthetic analogues thereof. 

Coronaviruses were first isolated from chickens in 1937, while the first human
coronavirus was propagated in vitro by Tyrrell and Bynoe in 1965. There are now about
13 species in this family, which infect cattle, pigs, rodents, cats, dogs, birds and man.
Coronavirus particles are irregularly shaped, about 60-220 nm in diameter, with an
outer envelope bearing distinctive, 'club-shaped' peplomers (about 20 nm long and 10
nm wide at the distal end). This 'crown-like' appearance gives the family its name. The
envelope carries two glycoproteins: S, the spike glycoprotein, which is involved in cell
fusion and is a major antigen, and M, the membrane glycoprotein, which is involved in
budding and envelope formation. The genome is associated with a basic phosphoprotein,
designated N. The genome of coronaviruses, a single stranded positive-sense RNA
strand, is typically 27-31 kb long and contains a 5' methylated cap and a 3' poly-A tail,
by which it can directly function as an mRNA in the infected cell. Initially the 5' ORF 1
(about 20 kb) is translated to produce a viral polymerase, which then produces a full
length negative sense strand. This is used as a template to produce mRNA as a 'nested
set' of transcripts, all with an identical 5' non-translated leader sequence of 72 nucleotides
and coincident 3' polyadenylated ends. Each mRNA thus produced is monocistronic, the
genes at the 5' end being translated from the longest mRNA and so on. These unusual
cytoplasmic structures are produced not by splicing, but by the polymerase during
transcription. Between each of the genes there is a repeated intergenic sequence,
AACUAAAC, which interacts with the transcriptase plus cellular factors to splice the
leader sequence onto the start of each ORF. In some coronaviruses there are about 8
ORFs, coding for the proteins mentioned above, but also for a haemagglutinin esterase
(HE), and several other non-structural proteins.
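
By way of illustration only, the repeated intergenic sequence mentioned above can be located in a genome by a plain string search. The following is a minimal sketch; the function name `find_intergenic_sites` and the toy sequence are hypothetical and are not part of the disclosed sequences of figure 1.

```python
# Minimal sketch: locate the repeated intergenic sequence AACUAAAC in an RNA
# (or cDNA) sequence. The function name and toy input are illustrative only.

INTERGENIC = "AACUAAAC"


def find_intergenic_sites(genome, motif=INTERGENIC):
    """Return the 0-based start positions of every occurrence of the motif.

    The input may be given as RNA (U) or as cDNA (T); it is normalised to RNA.
    Overlapping occurrences are reported as well.
    """
    rna = genome.upper().replace("T", "U")
    positions = []
    start = rna.find(motif)
    while start != -1:
        positions.append(start)
        start = rna.find(motif, start + 1)
    return positions


if __name__ == "__main__":
    # Toy sequence: two copies of the motif, each followed by a start codon.
    toy = "GGG" + "AACUAAAC" + "AUGCCCUUU" + "AACUAAAC" + "AUGAAA"
    print(find_intergenic_sites(toy))  # [3, 20]
```

Each reported position marks a candidate junction at which a leader sequence would be joined to the start of a downstream ORF; whether a given site is actually used cannot be decided from the sequence alone.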

Newly isolated viruses are phylogenetically corresponding to and thus
taxonomically corresponding to EMCR-CoV virus when comprising a gene order and/or
amino acid sequence and/or nucleotide sequence sufficiently similar to our prototypic
EMCR-CoV virus. The highest amino acid sequence identities between ORFs of EMCR-CoV
virus and any of the known other viruses of the same family to date are with human
coronavirus 229E or Porcine Epidemic Diarrhea Virus (see Figures 3 and 4). The amino
acid identities with human coronavirus 229E range from 45% (Nucleoprotein) to 81%
(Replicase 1b); interestingly, Replicase 1a has an identity of just 56%, contrasting with
Replicase 1b's 81% identity. EMCR-CoV has a closer identity with human coronavirus
229E than with any of the known other viruses of the same family to date for all
putative ORFs, with the exception of Matrix, which is slightly more closely related to
the Matrix ORF of PEDV. Individual proteins or whole virus isolates with, respectively,
higher homology than these mentioned maximum values are considered
phylogenetically corresponding and thus taxonomically corresponding to EMCR-CoV
virus, and generally will be encoded by a nucleic acid sequence structurally
corresponding with a sequence as shown in figure 1. Herewith the invention provides a
virus phylogenetically corresponding to the isolated virus of which the sequences are
depicted in figure 1.

It should be noted that, similar to other viruses, a certain degree of variation can
be expected to be found between EMCR-CoV viruses isolated from different sources.

Also, the viral sequence of the EMCR-CoV virus or an isolated EMCR-CoV virus
gene as provided herein for example shows less than 95%, preferably less than 90%,
more preferably less than 80%, more preferably less than 70% and most preferably less
than 65% nucleotide sequence homology or less than 95%, preferably less than 90%,
more preferably less than 80%, more preferably less than 70% and most preferably less
than 65% amino acid sequence homology with the respective nucleotide or amino acid
sequence of the human coronavirus 229E or Porcine Epidemic Diarrhea Virus as for
example can be found in Genbank (for example in accession number af304460 (HCoV
229E) or af353511 (PEDV)).

Sequence divergence of EMCR-CoV strains around the world may be somewhat
higher, in analogy with other coronaviruses.

The term "nucleotide sequence homology" as used herein denotes the presence of
homology between two polynucleotides. Polynucleotides have "homologous" sequences
if the sequence of nucleotides in the two sequences is the same when aligned for
maximum correspondence. Sequence comparison between two or more polynucleotides is
generally performed by comparing portions of the two sequences over a comparison
window to identify and compare local regions of sequence similarity. The comparison
window is generally from about 20 to 200 contiguous nucleotides. The "percentage of
sequence homology" for polynucleotides, such as 50, 60, 70, 80, 90, 95, 98, 99 or 100
percent sequence homology, may be determined by comparing two optimally aligned
sequences over a comparison window, wherein the portion of the polynucleotide
sequence in the comparison window may include additions or deletions (i.e. gaps) as
compared to the reference sequence (which does not comprise additions or deletions) for
optimal alignment of the two sequences. The percentage is calculated by: (a)
determining the number of positions at which the identical nucleic acid base occurs in
both sequences to yield the number of matched positions; (b) dividing the number of
matched positions by the total number of positions in the window of comparison; and (c)
multiplying the result by 100 to yield the percentage of sequence homology. Optimal
alignment of sequences for comparison may be conducted by computerized
implementations of known algorithms, or by inspection. Readily available sequence
comparison and multiple sequence alignment algorithms are, respectively, the Basic
Local Alignment Search Tool (BLAST) (Altschul, S.F. et al. 1990. J. Mol. Biol. 215:403;
Altschul, S.F. et al. 1997. Nucleic Acid Res. 25:3389-3402) and ClustalW programs, both
available on the internet. Other suitable programs include GAP, BESTFIT and FASTA
in the Wisconsin Genetics Software Package (Genetics Computer Group (GCG),
Madison, WI, USA).
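
Steps (a) to (c) of the percentage calculation can be sketched as follows. This is a minimal illustration that assumes the two sequences have already been optimally aligned (for example by one of the programs named above) and are supplied as equal-length strings in which "-" marks a gap; the function name `percent_homology` is hypothetical.

```python
# Minimal sketch of steps (a)-(c): percentage of sequence homology over a
# comparison window of two pre-aligned sequences. "-" denotes a gap.
# The function name is illustrative only.


def percent_homology(aligned_a, aligned_b):
    """Return 100 * matched positions / total positions in the window."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window must not be empty")
    # (a) count positions at which the identical base occurs in both sequences;
    #     a gap never counts as a match.
    matched = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    # (b) divide by the total number of positions in the window, and
    # (c) multiply by 100.
    return 100.0 * matched / len(aligned_a)


if __name__ == "__main__":
    print(percent_homology("ACGTACGTAC", "ACGTTCGTAC"))  # one mismatch in ten
    print(percent_homology("ACG-ACGTAC", "ACGTACGTAC"))  # one gap in ten
```

In this sketch gap positions are included in the total number of positions of the window, in line with step (b); other conventions for handling gaps exist and would give slightly different percentages.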

As used herein, "substantially complementary" means that two nucleic acid sequences
have at least about 65%, preferably about 70%, more preferably about 80%, even more
preferably 90%, and most preferably about 98%, sequence complementarity to each
other. This means that the primers and probes must exhibit sufficient complementarity
to their template and target nucleic acid, respectively, to hybridise under stringent
conditions. Therefore, the primer sequences as disclosed in this specification need not
reflect the exact sequence of the binding region on the template and degenerate primers
can be used. A substantially complementary primer sequence is one that has sufficient
sequence complementarity to the amplification template to result in primer binding and
second-strand synthesis.
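
The notion of "substantially complementary" can be illustrated with a short calculation. The sketch below assumes DNA sequences written 5' to 3', a primer and a binding region of equal length, and no gaps; the function names and the example sequences are hypothetical, and the 65% default is the lowest threshold given above.

```python
# Minimal sketch: percentage complementarity between a primer and its binding
# region on the template, both written 5'->3'. DNA only, equal lengths, no gaps.
# Function names and example sequences are illustrative only.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_complementarity(primer, binding_region):
    """Percentage of primer positions that base-pair with the binding region."""
    if not primer or len(primer) != len(binding_region):
        raise ValueError("primer and binding region must be non-empty and equal in length")
    # A perfectly complementary primer equals the reverse complement of the region.
    ideal = reverse_complement(binding_region)
    paired = sum(1 for p, i in zip(primer.upper(), ideal) if p == i)
    return 100.0 * paired / len(primer)


def is_substantially_complementary(primer, binding_region, threshold=65.0):
    """Apply a percentage threshold (default: the lowest value named in the text)."""
    return percent_complementarity(primer, binding_region) >= threshold


if __name__ == "__main__":
    region = "AAACCCGGGT"
    print(percent_complementarity("ACCCGGGTTT", region))         # perfect match
    print(is_substantially_complementary("ACCCGGGTAA", region))  # two mismatches
    print(is_substantially_complementary("TTTTGGGTTT", region))  # four mismatches
```

A percentage of this kind is only a first screen: whether a primer actually binds and supports second-strand synthesis also depends on the position of the mismatches and on the hybridisation conditions.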

The term "hybrid" refers to a double-stranded nucleic acid molecule, or duplex,
formed by hydrogen bonding between complementary nucleotides. The terms "hybridise"
or "anneal" refer to the process by which single strands of nucleic acid sequences form
double-helical segments through hydrogen bonding between complementary nucleotides.

The term "oligonucleotide" refers to a short sequence of nucleotide monomers
(usually 6 to 100 nucleotides) joined by phosphorus linkages (e.g., phosphodiester, alkyl-
and aryl-phosphate, phosphorothioate), or non-phosphorus linkages (e.g., peptide,
sulfamate and others). An oligonucleotide may contain modified nucleotides having
modified bases (e.g., 5-methyl cytosine) and modified sugar groups (e.g., 2'-O-methyl
ribosyl, 2'-O-methoxyethyl ribosyl, 2'-fluoro ribosyl, 2'-amino ribosyl, and the like).
Oligonucleotides may be naturally-occurring or synthetic molecules of double- and
single-stranded DNA and double- and single-stranded RNA with circular, branched or
linear shapes and optionally including domains capable of forming stable secondary
structures (e.g., stem-and-loop and loop-stem-loop structures).

The term "primer" as used herein refers to an oligonucleotide which is capable of
annealing to the amplification target allowing a DNA polymerase to attach, thereby
serving as a point of initiation of DNA synthesis when placed under conditions in which
synthesis of primer extension product which is complementary to a nucleic acid strand
is induced, i.e., in the presence of nucleotides and an agent for polymerization such as
DNA polymerase and at a suitable temperature and pH. The (amplification) primer is
preferably single stranded for maximum efficiency in amplification. Preferably, the
primer is an oligodeoxyribonucleotide. The primer must be sufficiently long to prime the
synthesis of extension products in the presence of the agent for polymerization. The
exact lengths of the primers will depend on many factors, including temperature and
source of primer. A "pair of bi-directional primers" as used herein refers to one forward
and one reverse primer as commonly used in the art of DNA amplification such as in
PCR amplification.

The term "probe" refers to a single-stranded oligonucleotide sequence that will 
recognize and form a hydrogen-bonded duplex with a complementary sequence in a 
target nucleic acid sequence analyte or its cDNA derivative. 

The terms "stringency" or "stringent hybridization conditions" refer to
hybridization conditions that affect the stability of hybrids, e.g., temperature, salt
concentration, pH, formamide concentration and the like. These conditions are
empirically optimised to maximize specific binding and minimize non-specific binding of
primer or probe to its target nucleic acid sequence. The terms as used include reference
to conditions under which a probe or primer will hybridise to its target sequence, to a
detectably greater degree than other sequences (e.g. at least 2-fold over background).
Stringent conditions are sequence dependent and will be different in different
circumstances. Longer sequences hybridise specifically at higher temperatures.
Generally, stringent conditions are selected to be about 5°C lower than the thermal
melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm
is the temperature (under defined ionic strength and pH) at which 50% of a
complementary target sequence hybridises to a perfectly matched probe or primer.
Typically, stringent conditions will be those in which the salt concentration is less than
about 1.0 M Na+ ion, typically about 0.01 to 1.0 M Na+ ion concentration (or other salts)
at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes or primers
(e.g. 10 to 50 nucleotides) and at least about 60°C for long probes or primers (e.g. greater
than 50 nucleotides). Stringent conditions may also be achieved with the addition of
destabilizing agents such as formamide. Exemplary low stringent conditions or
"conditions of reduced stringency" include hybridization with a buffer solution of 30%
formamide, 1 M NaCl, 1% SDS at 37°C and a wash in 2x SSC at 40°C. Exemplary high
stringency conditions include hybridization in 50% formamide, 1 M NaCl, 1% SDS at
37°C, and a wash in 0.1x SSC at 60°C. Hybridization procedures are well known in the
art and are described in e.g. Ausubel et al., Current Protocols in Molecular Biology, John
Wiley & Sons Inc., 1994.
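
The rules of thumb stated above (a hybridisation temperature about 5°C below the Tm, a salt concentration of at most about 1.0 M Na+, and a minimum temperature that depends on probe length) can be written down as a simple check. The sketch below is an assumption-laden illustration: the function names are hypothetical, the Tm is taken as a given input rather than calculated, and the cut-offs are applied as sharp limits although the text describes them as approximate.

```python
# Minimal sketch of the stringency rules of thumb given in the text.
# Function names are illustrative; Tm must be supplied by the user.


def stringent_temperature(tm_celsius, offset=5.0):
    """Temperature about 5 degrees C below the thermal melting point (Tm)."""
    return tm_celsius - offset


def meets_typical_stringency(probe_length, temperature_c, na_molar):
    """Check the typical conditions named in the text.

    - salt concentration no higher than about 1.0 M Na+
    - at least about 30 C for short probes or primers (up to 50 nucleotides)
    - at least about 60 C for long probes or primers (more than 50 nucleotides)
    """
    if probe_length <= 0:
        raise ValueError("probe length must be positive")
    minimum_temperature = 30.0 if probe_length <= 50 else 60.0
    return na_molar <= 1.0 and temperature_c >= minimum_temperature


if __name__ == "__main__":
    print(stringent_temperature(65.0))              # 60.0
    print(meets_typical_stringency(20, 37.0, 0.5))  # short primer at 37 C
    print(meets_typical_stringency(80, 37.0, 0.5))  # long probe at 37 C
```

In practice these conditions are optimised empirically, as noted above, and formamide or other destabilizing agents shift the effective stringency in ways that this check does not capture.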

The term "antibody" includes reference to antigen binding forms of antibodies
(e.g., Fab, F(ab)2). The term "antibody" frequently refers to a polypeptide substantially
encoded by an immunoglobulin gene or immunoglobulin genes, or fragments thereof
which specifically bind and recognize an analyte (antigen). However, while various
antibody fragments can be defined in terms of the digestion of an intact antibody, one of
skill will appreciate that such fragments may be synthesized de novo either chemically
or by utilizing recombinant DNA methodology. Thus, the term antibody, as used herein,
also includes antibody fragments such as single chain Fv, chimeric antibodies (i.e.,
comprising constant and variable regions from different species), humanized antibodies
(i.e., comprising a complementarity determining region (CDR) from a non-human
source) and heteroconjugate antibodies (e.g., bispecific antibodies).

In short, the invention provides an isolated essentially mammalian positive-sense
single stranded RNA virus (EMCR-CoV) belonging to the Coronaviruses and
identifiable as phylogenetically corresponding thereto by determining a nucleic acid
sequence of a suitable fragment of the genome of said virus and testing it in
phylogenetic tree analyses wherein maximum likelihood trees are generated using 100
bootstraps and 3 jumbles and finding it to be more closely phylogenetically
corresponding to a virus isolate having the sequences as depicted in figure 1 than it is
corresponding to a virus isolate of PEDV (porcine epidemic diarrhea virus), HCoV-229E
(human coronavirus 229E), PRCoV (porcine respiratory coronavirus), TGEV
(transmissible gastroenteritis virus), CaCoV (canine coronavirus) and FeCoV (feline
coronavirus).

Suitable nucleic acid genome fragments each useful for such phylogenetic tree
analyses are for example any of the fragments encoding the Matrix protein or the
Nucleocapsid protein as disclosed in Figure 1, leading to the phylogenetic tree analyses
as disclosed herein in figure 2a or 2b. Other suitable nucleic acid fragments useful for
such phylogenetic tree analyses are for example any of the fragments encoding Replicase
1a and 1b, Spike, ORF 4a and 4b, and E.

A suitable open reading frame (ORF) useful in phylogenetic analyses comprises
the ORF encoding the viral replicase (ORF 1a). When an overall amino acid identity of
at least 60%, preferably of at least 70%, more preferably of at least 80%, more preferably
of at least 90%, most preferably of at least 96% of the analysed replicase with the
replicase having a sequence comprising the amino acids of Figure 1 is found, the
analysed virus isolate comprises an EMCR-CoV virus isolate according to the invention.

Another suitable open reading frame (ORF) useful in phylogenetic analyses comprises
the ORF encoding the viral replicase (ORF 1b). When an overall amino acid identity of
at least 82%, more preferably of at least 90%, most preferably of at least 95% of the
analysed replicase with the replicase having a sequence comprising the amino acids of
Figure 1 is found, the analysed virus isolate comprises an EMCR-CoV virus isolate
according to the invention.

Another suitable open reading frame (ORF) useful in phylogenetic analyses
comprises the ORF encoding the Nucleocapsid protein. When an overall amino acid
identity of at least 50%, more preferably of at least 60%, more preferably of at least 70%,
more preferably of at least 80%, more preferably of at least 90%, most preferably of at
least 95% of the analysed Nucleocapsid protein with the Nucleocapsid protein encoded
by a sequence comprising (part of) the sequence F of Figure 1 is found, the analysed
virus isolate comprises an EMCR-CoV isolate according to the invention.

Another suitable open reading frame (ORF) useful in phylogenetic analyses
comprises the ORF encoding the Matrix protein. When an overall amino acid identity of
at least 60%, more preferably of at least 70%, more preferably of at least 80%, more
preferably of at least 90%, most preferably of at least 95% of the analysed Matrix
protein with the Matrix protein encoded by a sequence comprising (part of) the sequence
F of Figure 1 is found, the analysed virus isolate comprises an EMCR-CoV isolate
according to the invention.

Another suitable open reading frame (ORF) useful in phylogenetic analyses
comprises the ORF encoding the spike protein S. When an overall amino acid identity of
at least 55%, more preferably of at least 60%, more preferably of at least 70%, more
preferably of at least 80%, more preferably of at least 90%, most preferably of at least
95% of the analysed S-protein with the S-protein encoded by a sequence comprising the
sequence of translation 2 of E and translation 1 of the F sequence of the S-protein as
depicted in Figure 1 is found, the analysed virus isolate comprises an EMCR-CoV virus
isolate according to the invention. The S ORF of the EMCR-CoV virus seems to be located
adjacent to the ORF 1ab (coding for the viral replicase), which would discriminate
EMCR-CoV viruses from the bovine coronavirus and the murine hepatitis virus, which
have a so-called 2a gene and an HE-gene between the S protein and the viral
polymerase.
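
The per-ORF identity criteria above lend themselves to a simple lookup. The following minimal sketch records only the lowest ("at least") threshold stated for each ORF and reports, for measured overall amino acid identities, which ORFs meet it; the dictionary keys and the function name are hypothetical labels chosen for illustration.

```python
# Minimal sketch: compare measured overall amino acid identities (in percent)
# with the lowest threshold stated in the text for each ORF.
# Keys and function name are illustrative labels only.

MIN_IDENTITY = {
    "replicase_1a": 60.0,
    "replicase_1b": 82.0,
    "nucleocapsid": 50.0,
    "matrix": 60.0,
    "spike": 55.0,
}


def orfs_meeting_threshold(identities):
    """Map each supplied ORF to True if its identity reaches the stated minimum."""
    unknown = set(identities) - set(MIN_IDENTITY)
    if unknown:
        raise KeyError("no threshold recorded for: " + ", ".join(sorted(unknown)))
    return {orf: pct >= MIN_IDENTITY[orf] for orf, pct in identities.items()}


if __name__ == "__main__":
    # Identities reported in the text for human coronavirus 229E versus EMCR-CoV.
    print(orfs_meeting_threshold({"replicase_1b": 81.0, "nucleocapsid": 45.0}))
```

With the identities reported herein for human coronavirus 229E (81% for Replicase 1b, 45% for Nucleoprotein), neither ORF reaches its threshold, which is consistent with the thresholds being set just above the values found for the closest known relative.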

The invention provides among others an isolated or recombinant nucleic acid or 
virus-specific functional fragment thereof obtainable from a virus according to the 
invention. The isolated or recombinant nucleic acids comprise the sequences as given in
figure 1 or sequences of homologues which are able to hybridise with those under 
stringent conditions. In particular, the invention provides primers and/or probes 
suitable for identifying an EMCR-CoV virus nucleic acid. 

Furthermore, the invention provides a vector comprising a nucleic acid according 
to the invention. To begin with, vectors such as plasmid vectors containing (parts of) the 
genome of the EMCR-CoV virus, virus vectors containing (parts of) the genome of the 
EMCR-CoV (for example, but not limited thereto, vaccinia virus, retroviruses, 
baculovirus), or EMCR-CoV virus containing (parts of) the genome of other viruses or
other pathogens are provided. 

Also, the invention provides a host cell comprising a nucleic acid or a vector 
according to the invention. Plasmid or viral vectors containing the replicase components 
of EMCR-CoV virus are generated in prokaryotic cells for the expression of the 
components in relevant cell types (bacteria, insect cells, eukaryotic cells). Plasmid or 
viral vectors containing full-length or partial copies of the EMCR-CoV virus genome will 
be generated in prokaryotic cells for the expression of viral nucleic acids in vitro or in
vivo. The latter vectors may contain other viral sequences for the generation of chimeric
viruses or chimeric virus proteins, may lack parts of the viral genome for the generation 
of replication defective virus, and may contain mutations, deletions or insertions for the 
generation of attenuated viruses.

Infectious copies of EMCR-CoV virus (being wild type, attenuated,
replication-defective or chimeric) can be produced upon co-expression of the polymerase
components according to the state-of-the-art technologies described above.

In addition, eukaryotic cells, transiently or stably expressing one or more full- 
length or partial EMCR-CoV virus proteins can be used. Such cells can be made by 
transfection (proteins or nucleic acid vectors), infection (viral vectors) or transduction 
(viral vectors) and may be useful for complementation of mentioned wild type, 
attenuated, replication-defective or chimeric viruses.

A chimeric virus may be of particular use for the generation of recombinant
vaccines protecting against two or more viruses. For example, it can be envisaged that an
EMCR-CoV virus vector expressing one or more proteins of a human metapneumovirus
or a human metapneumovirus vector expressing one or more proteins of EMCR-CoV
virus will protect individuals vaccinated with such vector against both virus infections.
Such a specific chimeric virus is particularly useful in the invention because it is
suspected that co-infection with, for instance, human metapneumovirus frequently occurs
in coronavirus infected patients. Attenuated and replication-defective viruses may be of
use for vaccination purposes with live vaccines as has been suggested for other viruses.

In a preferred embodiment, the invention provides a proteinaceous molecule or
coronavirus-specific viral protein or functional fragment thereof encoded by a nucleic
acid according to the invention. Useful proteinaceous molecules are for example derived
from any of the genes or genomic fragments derivable from a virus according to the
invention. Such molecules, or antigenic fragments thereof, as provided herein, are for
example useful in diagnostic methods or kits and in pharmaceutical compositions such
as sub-unit vaccines and inhibitory peptides. Particularly useful are the viral replicase
protein, the spike protein, the matrix protein, the nucleocapsid or antigenic fragments
thereof for inclusion as antigen or subunit immunogen, but inactivated whole virus can
also be used. Particularly useful are also those proteinaceous substances that are
encoded by recombinant nucleic acid fragments that are identified for phylogenetic
analyses; of course preferred are those that are within the preferred metes and bounds
of ORFs useful in phylogenetic analyses, in particular for eliciting EMCR-CoV virus
specific antibodies, whether in vivo (e.g. for protective purposes or for providing
diagnostic antibodies) or in vitro (e.g. by phage display technology or another technique
useful for generating synthetic antibodies).

Also provided herein are antibodies, be it natural polyclonal or monoclonal, or
synthetic (e.g. (phage) library-derived binding molecules) antibodies that specifically
react with an antigen comprising a proteinaceous molecule or EMCR-CoV virus-specific
functional fragment thereof according to the invention. Such antibodies are useful in a
method for identifying a viral isolate as an EMCR-CoV virus comprising reacting said
viral isolate or a component thereof with an antibody as provided herein. This can for
example be achieved by using purified or non-purified EMCR-CoV virus or parts thereof
(proteins, peptides) using ELISA, RIA, FACS or similar formats of antigen detection
assays (Current Protocols in Immunology). Alternatively, infected cells or cell cultures
may be used to identify viral antigens using classical immunofluorescence or
immunohistochemical techniques. Specifically useful in this respect are antibodies
raised against EMCR-CoV virus proteins which are encoded by a nucleotide sequence
comprising one or more of the sequences disclosed in figure 1.

Other methods for identifying a viral isolate as an EMCR-CoV virus comprise
reacting said viral isolate or a component thereof with a virus specific nucleic acid
according to the invention.

In this way the invention provides a viral isolate identifiable with a method
according to the invention as a mammalian virus taxonomically corresponding to a
positive-sense single stranded RNA virus identifiable as likely belonging to the
EMCR-CoV virus genus within the family of Coronaviruses.

The method is useful in a method for virologically diagnosing an EMCR-CoV 
virus infection of a mammal, said method for example comprising determining in a 
sample of said mammal the presence of a viral isolate or component thereof by reacting 
said sample with a nucleic acid or an antibody according to the invention. 

Methods of the invention can in principle be performed by using any nucleic acid
amplification method, such as the Polymerase Chain Reaction (PCR; Mullis 1987, U.S.
Pat. Nos. 4,683,195, 4,683,202, and 4,800,159) or by using amplification reactions such as
Ligase Chain Reaction (LCR; Barany 1991, Proc. Natl. Acad. Sci. USA 88:189-193; EP
Appl. No. 320,308), Self-Sustained Sequence Replication (3SR; Guatelli et al., 1990,
Proc. Natl. Acad. Sci. USA 87:1874-1878), Strand Displacement Amplification (SDA;
U.S. Pat. Nos. 5,270,184 and 5,455,166), Transcriptional Amplification System (TAS;
Kwoh et al., Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al.,
1988, Bio/Technology 6:1197), Rolling Circle Amplification (RCA; U.S. Pat. No.
5,871,921), Nucleic Acid Sequence Based Amplification (NASBA), Cleavase Fragment
Length Polymorphism (U.S. Pat. No. 5,719,028), Isothermal and Chimeric Primer-initiated
Amplification of Nucleic Acid (ICAN), Ramification-extension Amplification
Method (RAM; U.S. Pat. Nos. 5,719,028 and 5,942,391) or other suitable methods for
amplification of nucleic acids.

In order to amplify a nucleic acid with a small number of mismatches to one or more of
the amplification primers, an amplification reaction may be performed under conditions
of reduced stringency (e.g. a PCR amplification using an annealing temperature of
38°C, or the presence of 3.5 mM MgCl2). The person skilled in the art will be able to
select conditions of suitable stringency.

The primers herein are selected to be "substantially" complementary (i.e. at least
65%, more preferably at least 80% perfectly complementary) to their target regions
present on the different strands of each specific sequence to be amplified. It is possible
to use primer sequences containing e.g. inosine residues or ambiguous bases or even
primers that contain one or more mismatches when compared to the target sequence. In
general, sequences that exhibit at least 65%, more preferably at least 80% homology
with the target DNA or RNA oligonucleotide sequences, are considered suitable for use
in a method of the present invention. Sequence mismatches are also not critical when
using low stringency hybridization conditions.
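
The complementarity thresholds above can be illustrated with a minimal sketch. It assumes an ungapped, equal-length comparison between a primer and its target site, both written 5' to 3'; the function names and the example sequences are hypothetical and are not taken from the specification.

```python
# Minimal sketch (assumption: ungapped comparison, plain A/C/G/T bases only).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(primer: str, target_site: str) -> float:
    """Percentage of primer positions that base-pair with the target site.

    Both sequences are written 5'->3'. A primer anneals antiparallel, so it is
    compared with the reverse complement of the target site.
    """
    primer = primer.upper()
    target_site = target_site.upper()
    if len(primer) != len(target_site):
        raise ValueError("primer and target site must have equal length")
    expected = "".join(COMPLEMENT[base] for base in reversed(target_site))
    matches = sum(p == e for p, e in zip(primer, expected))
    return 100.0 * matches / len(primer)


def is_suitable(primer: str, target_site: str, threshold: float = 65.0) -> bool:
    """True if the primer meets the complementarity threshold (65% by default)."""
    return percent_complementarity(primer, target_site) >= threshold


if __name__ == "__main__":
    site = "ATGTTTTACA"  # hypothetical 10-nt target site
    print(percent_complementarity("TGTAAAACAT", site))  # perfectly complementary primer
    print(percent_complementarity("TGTAAAACGG", site))  # primer with two mismatches
```

Passing `threshold=80.0` applies the more preferred limit mentioned above.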

The detection of the amplification products can in principle be accomplished by
any suitable method known in the art. The detection fragments may be directly stained
or labelled with radioactive labels, antibodies, luminescent dyes, fluorescent dyes, or
enzyme reagents. Direct DNA stains include for example intercalating dyes such as
acridine orange, ethidium bromide, ethidium monoazide or Hoechst dyes.
Alternatively, the DNA or RNA fragments may be detected by incorporation of labelled
dNTP bases into the synthesized fragments. Detection labels which may be associated
with nucleotide bases include e.g. fluorescein, cyanine dye or BrdUrd.

When using a probe-based detection system, a suitable detection procedure for
use in the present invention may for example comprise an enzyme immunoassay (EIA)
format (Jacobs et al., 1997, J. Clin. Microbiol. 35, 791-795). For performing a detection
by manner of the EIA procedure, either the forward or the reverse primer used in the
amplification reaction may comprise a capturing group, such as a biotin group for
immobilization of target DNA PCR amplicons on e.g. streptavidin coated microtiter
plate wells for subsequent EIA detection of target DNA amplicons (see below). The
skilled person will understand that other groups for immobilization of target DNA PCR
amplicons in an EIA format may be employed.

Probes useful for the detection of the target DNA as disclosed herein preferably 
bind only to at least a part of the DNA sequence region as amplified by the DNA 
amplification procedure. Those of skill in the art can prepare suitable probes for 
detection based on the nucleotide sequence of the target DNA without undue 
experimentation as set out herein. Also the complementary nucleotide sequences, 
whether DNA or RNA or chemically synthesized analogs, of the target DNA may 
suitably be used as type-specific detection probes in a method of the invention, provided 
that such a complementary strand is amplified in the amplification reaction employed. 
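
As a small illustration of deriving a complementary-strand probe sequence from an amplified strand, the following sketch computes the reverse complement as DNA or as RNA. The function name and the example sequence are hypothetical; only unambiguous bases are handled.

```python
# Minimal sketch (assumption: input contains only A, C, G, T or U).
def reverse_complement(seq: str, rna: bool = False) -> str:
    """Return the reverse complement of seq, written 5'->3'.

    With rna=True the result uses U instead of T, as for an RNA probe.
    """
    pairs = {"A": "U" if rna else "T", "T": "A", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


if __name__ == "__main__":
    amplicon_strand = "ATGTTTTACA"  # hypothetical stretch of an amplified strand
    print(reverse_complement(amplicon_strand))            # DNA probe sequence
    print(reverse_complement(amplicon_strand, rna=True))  # RNA probe sequence
```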

Suitable detection procedures for use herein may for example comprise
immobilization of the amplicons and probing the DNA sequences thereof by e.g.
Southern blotting. Other formats may comprise an EIA format as described above. To
facilitate the detection of binding, the specific amplicon detection probes may comprise a
label moiety such as a fluorophore, a chromophore, an enzyme or a radio-label, so as to
facilitate monitoring of binding of the probes to the reaction product of the amplification
reaction. Such labels are well-known to those skilled in the art and include, for example,
fluorescein isothiocyanate (FITC), β-galactosidase, horseradish peroxidase, streptavidin,
biotin, digoxigenin, 35S or 125I. Other examples will be apparent to those skilled in the
art.

Detection may also be performed by a so called reverse line blot (RLB) assay,
such as for instance described by Van den Brule et al. (2002, J. Clin. Microbiol. 40,
779-787). For this purpose RLB probes are preferably synthesized with a 5' amino group
for subsequent immobilization on e.g. carboxyl-coated nylon membranes. The advantage 
of an RLB format is the ease of the system and its speed, thus allowing for high 
throughput sample processing. 

The use of nucleic acid probes for the detection of RNA or DNA fragments is well
known in the art. Mostly these procedures comprise the hybridization of the target
nucleic acid with the probe followed by post-hybridization washings. Specificity is
typically the function of post-hybridization washes, the critical factors being the ionic
strength and temperature of the final wash solution. For nucleic acid hybrids, the Tm
can be approximated from the equation of Meinkoth and Wahl, Anal. Biochem., 138:
267-284 (1984): Tm = 81.5 °C + 16.6 (log M) + 0.41 (% GC) - 0.61 (% form) - 500/L; where M
is the molarity of monovalent cations, % GC is the percentage of guanosine and cytosine
nucleotides in the nucleic acid, % form is the percentage of formamide in the
hybridization solution, and L is the length of the hybrid in base pairs. The Tm is the
temperature (under defined ionic strength and pH) at which 50% of a complementary
target sequence hybridizes to a perfectly matched probe. Tm is reduced by about 1 °C for
each 1 % of mismatching; thus, the hybridization and/or wash conditions can be
adjusted to hybridize to sequences of the desired identity. For example, if sequences
with > 90% identity are sought, the Tm can be decreased 10 °C. Generally, stringent
conditions are selected to be about 5 °C lower than the thermal melting point (Tm) for
the specific sequence and its complement at a defined ionic strength and pH. However,
severely stringent conditions can utilize a hybridization and/or wash at 1, 2, 3, or 4 °C
lower than the thermal melting point (Tm); moderately stringent conditions can utilize a
hybridization and/or wash at 6, 7, 8, 9, or 10 °C lower than the thermal melting point
(Tm); low stringency conditions can utilize a hybridization and/or wash at 11, 12, 13, 14,
15, or 20 °C lower than the thermal melting point (Tm). Using the equation,
hybridization and wash compositions, and desired Tm, those of ordinary skill will
understand that variations in the stringency of hybridization and/or wash solutions are
inherently described. If the desired degree of mismatching results in a Tm of less than
45 °C (aqueous solution) or 32 °C (formamide solution) it is preferred to increase the
SSC concentration so that a higher temperature can be used. An extensive guide to the
hybridization of nucleic acids is found in Tijssen, Laboratory Techniques in Biochemistry
and Molecular Biology: Hybridization with Nucleic Acid Probes, Part I, Chapter 2
"Overview of principles of hybridization and the strategy of nucleic acid probe assays",
Elsevier, New York (1993); and Current Protocols in Molecular Biology, Chapter 2,
Ausubel, et al., Eds., Greene Publishing and Wiley-Interscience, New York (1995).
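
The Tm approximation and the stringency ranges described above can be worked through numerically. The following is a minimal sketch: the function names are hypothetical, the log is taken as base 10, and the class boundaries between the listed whole-degree values (for example between 5 and 6 °C below Tm) are an assumption made only so that every wash temperature falls into one class.

```python
import math


def melting_temperature(monovalent_molar: float, gc_percent: float,
                        formamide_percent: float, length_bp: int) -> float:
    """Tm in degrees C: 81.5 + 16.6 log10(M) + 0.41 (%GC) - 0.61 (%form) - 500/L."""
    return (81.5
            + 16.6 * math.log10(monovalent_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)


def adjusted_for_mismatch(tm: float, mismatch_percent: float) -> float:
    """Lower Tm by about 1 degree C for each 1 % of mismatching."""
    return tm - mismatch_percent


def stringency_class(tm: float, wash_temp: float) -> str:
    """Classify a hybridization/wash temperature by how far it lies below Tm."""
    delta = tm - wash_temp
    if delta < 0:
        return "above Tm"
    if delta <= 4:
        return "severely stringent"
    if delta <= 5:
        return "stringent"
    if delta <= 10:
        return "moderately stringent"
    return "low stringency"


if __name__ == "__main__":
    # Hypothetical hybrid: 1 M monovalent cations, 50 % GC, no formamide, 100 bp.
    tm = melting_temperature(1.0, 50.0, 0.0, 100)
    print(round(tm, 1))                               # about 97.0
    print(round(adjusted_for_mismatch(tm, 10.0), 1))  # target when 90 % identity is sought
    print(stringency_class(97.0, 92.0))
```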

In another aspect, the invention provides oligonucleotide probes for the generic
detection of target RNA or DNA. The detection probes herein are selected to be
"substantially" complementary to one of the strands of the double stranded nucleic acid
generated by an amplification reaction of the invention. Preferably the probes are
substantially complementary to the immobilizable, e.g. biotin labelled, antisense strand
of the amplicons generated from the target RNA or DNA.

It is allowable for detection probes of the present invention to contain one or 
more mismatches to their target sequence. In general, sequences that exhibit at least 
65%, more preferably at least 80% homology with the target oligonucleotide sequences 
are considered suitable for use in a method of the present invention.

Antibodies, both monoclonal and polyclonal, can also be used for detection
purposes in the present invention, for example, in immunoassays in which they can be
utilized in liquid phase or bound to a solid phase carrier. In addition, the monoclonal
antibodies in these immunoassays can be detectably labeled in various ways. A variety
of immunoassay formats may be used to select antibodies specifically reactive with a
particular protein (or other analyte). For example, solid-phase ELISA immunoassays are 
routinely used to select monoclonal antibodies specifically immunoreactive with a 
protein. See Harlow and Lane, Antibodies, A Laboratory Manual, Cold Spring Harbor 
Publications, New York (1988), for a description of immunoassay formats and conditions 
that can be used to determine selective binding. Examples of types of immunoassays 
that can utilize antibodies of the invention are competitive and non-competitive 
immunoassays in either a direct or indirect format. Examples of such immunoassays are 
the radioimmunoassay (RIA) and the sandwich (immunometric) assay. Detection of the 
antigens using the antibodies of the invention can be done utilizing immunoassays that 
are run in either the forward, reverse, or simultaneous modes, including 
immunohistochemical assays on physiological samples. Those of skill in the art will 
know, or can readily discern, other immunoassay formats without undue 
experimentation. 

Antibodies can be bound to many different carriers and used to detect the 
presence of the target molecules. Examples of well-known carriers include glass, 
polystyrene, polypropylene, polyethylene, dextran, nylon, amyloses, natural and
modified celluloses, polyacrylamides, agaroses and magnetite. The nature of the carrier 
can be either soluble or insoluble for purposes of the invention. Those skilled in the art 
will know of other suitable carriers for binding monoclonal antibodies, or will be able to 
ascertain such using routine experimentation. 

The invention also provides a method for serologically diagnosing an EMCR-CoV 
virus infection of a mammal comprising determining in a sample of said mammal the 
presence of an antibody specifically directed against an EMCR-CoV virus or component 
thereof by reacting said sample with a proteinaceous molecule or fragment thereof or an
antigen according to the invention.

Methods and means provided herein are particularly useful in a diagnostic kit for
diagnosing an EMCR-CoV virus infection, be it by virological or serological diagnosis.
Such kits or assays may for example comprise a virus, a nucleic acid, a proteinaceous
molecule or fragment thereof, an antigen and/or an antibody according to the invention.

Use of a virus, a nucleic acid, a proteinaceous molecule or fragment thereof, an
antigen and/or an antibody according to the invention is also provided for the production
of a pharmaceutical composition, for example for the treatment or prevention of EMCR-CoV
virus infections and/or for the treatment or prevention of atypical pneumonia, in
particular in humans. Preferably a peptide comprising part of the amino acid sequence
of the spike protein as depicted in the relevant translations of Figure 1, is used for the
preparation of a therapeutic or prophylactic peptide. Also preferably, a protein
comprising the amino acid sequence of the spike protein as depicted in the relevant
translations of Figure 1, is used for the preparation of a sub-unit vaccine. Furthermore,
the nucleocapsid of Coronaviruses, as depicted in the translation of Figure 1, is known
to be particularly useful for eliciting cell-mediated immunity against Coronaviruses and
can be used for the preparation of a sub-unit vaccine.

Attenuation of the virus can be achieved by established methods developed for 
this purpose, including but not limited to the use of related viruses of other species, 
serial passages through laboratory animals and/or tissue/cell cultures, serial passages
through cell cultures at temperatures below 37°C (cold-adaptation), site directed
mutagenesis of molecular clones and exchange of genes or gene fragments between 
related viruses. 

A pharmaceutical composition comprising a virus, a nucleic acid, a proteinaceous
molecule or fragment thereof, an antigen and/or an antibody according to the invention
can for example be used in a method for the treatment or prevention of an EMCR-CoV
virus infection and/or a respiratory illness comprising providing an individual with a
pharmaceutical composition according to the invention. This is most useful when said
individual is a human. Antibodies against EMCR-CoV virus proteins, especially
against the spike protein of EMCR-CoV virus, preferably against the amino acid
sequence as depicted in the translation in Figure 1, are also useful for prophylactic or
therapeutic purposes, as passive vaccines. It is known from other coronaviruses that the
spike protein is a very strong antigen and that antibodies against spike protein can be
used in prophylactic and therapeutic vaccination.

The invention also provides a method to obtain an antiviral agent useful in the
treatment of atypical pneumonia comprising establishing a cell culture or experimental
animal comprising a virus according to the invention, treating said culture or animal
with a candidate antiviral agent, and determining the effect of said agent on said virus
or its infection of said culture or animal. An example of such an antiviral agent
comprises an EMCR-CoV virus-neutralising antibody, or functional component thereof,
as provided herein, but antiviral agents of other nature can be obtained as well.

The invention also provides use of an antiviral agent according to the invention 
for the preparation of a pharmaceutical composition, in particular for the preparation of 
a pharmaceutical composition for the treatment of atypical pneumonia, specifically
when caused by an EMCR-CoV virus infection, and provides a pharmaceutical 
composition comprising an antiviral agent according to the invention, useful in a method 
for the treatment or prevention of an EMCR-CoV virus infection or atypical pneumonia, 
said method comprising providing an individual with such a pharmaceutical 
composition. 

The invention also comprises an animal model usable for testing of prophylactic 
and/or therapeutic methods and/or preparations. It is hypothesized that apes can be 
infected with the EMCR-CoV virus, thereby showing clinical symptoms, and more 
importantly, similar tissue morphology as found in humans suffering from atypical 
pneumonia caused by the EMCR-CoV virus. Subjecting apes to a prophylactic or 
therapeutic treatment either before or during infection with the virus will have a good
and useful predictive value for application of such a prophylaxis or therapy in
human subjects. 

The invention is further explained in the Examples without limiting it thereto.

Figure legends

Fig. 1: Nucleotide sequences from parts of the EMCR-CoV virus. Also included are the
putative amino acid sequences of polypeptides.

Fig. 2: Phylogenetic relationship for the nucleotide sequences of isolate EMCR-CoV with
its closest relatives genetically. Phylogenetic trees were generated by maximum
likelihood analyses using 100 bootstraps and 3 jumbles. The scale representing the
number of nucleotide changes is shown for each tree. Figure 2a: Maximum likelihood
tree of matrix gene nucleotide sequences. Numbers in trees represent bootstrap values.
The scale bar roughly reflects 10 % nucleotide differences between related sequences.
Figure 2b: Maximum likelihood tree of nucleocapsid gene nucleotide sequences.
Numbers in trees represent bootstrap values. The scale bar roughly reflects 10 %
nucleotide differences between related sequences.

Fig. 3: Similarity matrices indicating amino acid identity for the putative Replicase 1a,
Replicase 1b, Replicase 1ab, Spike, Orf E, Matrix and Nucleocapsid proteins (3a-g,
respectively), and for the putative Matrix protein and Nucleoprotein (3h and 3i resp.)
between the EMCR-CoV virus and closely related coronaviruses. See text for
abbreviations.

Fig. 4: Alignments with various coronaviruses: 5' untranslated region genomic
sequence (a); Putative orf 1a amino acid sequence (b); Putative orf 1b amino acid
sequence (c); Putative orf 1ab amino acid sequence (d); Putative Spike amino acid
sequence (e); Putative orf 4a amino acid sequence (f); Putative orf 4ab amino acid
sequence (g); Putative orf E amino acid sequence (h); Putative Matrix amino acid
sequence (i); Putative Nucleoprotein amino acid sequence (j); Putative 3' untranslated
genomic sequence (k). See text for abbreviations.

Examples

Specimen collection

Virus was collected from an 8 month old patient suffering from pneumonia using nasal
swabs.

Virus isolation and culture 

Throat swabs were dipped into a culture of tMK cells and passaged four times. Virus
was then cultured in Vero-118 cells. One litre of virus containing cell culture supernatant was
harvested, the virus was pelleted in an ultracentrifuge and the virus pellet was
resuspended in 1 ml PBS.

RNA isolation 

RNA was isolated from the supernatant of infected cell cultures or sucrose gradient 
fractions using a High Pure RNA Isolation kit according to instructions from the 
manufacturer (Roche Diagnostics, Almere, The Netherlands). 

Sequencing 

Purified RNA was sent to BaseClear holding BV (Leiden, The Netherlands) for
sequencing. 

Phylogenetic analyses

Nucleotide sequences were aligned using Clustal W running under BioEdit version 
5.0.9. Maximum likelihood trees were created using the Seqboot and DNA-ML packages 
of Phylip 5.6 using 100 bootstraps and 3 jumbles. The consensus trees were calculated 
using the Consense package of phylip 5.6. These consensus trees were used as usertree 
in DNA-ML to recalculate the branch lengths from the original sequences. 
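
The bootstrap step named above can be sketched in outline. The following illustrates only the resampling principle (drawing alignment columns with replacement to build replicate data sets); it is not the Seqboot program itself, it does not perform the maximum likelihood tree inference or the jumbling of input order, and the function names and toy alignment are hypothetical.

```python
import random


def bootstrap_alignment(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Build one bootstrap replicate by sampling alignment columns with replacement.

    The same sampled columns are used for every sequence, so each replicate
    column is an intact column of the original alignment.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    length = lengths.pop()
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


def bootstrap_replicates(alignment: dict[str, str], n: int = 100,
                         seed: int = 0) -> list[dict[str, str]]:
    """Generate n replicate alignments (100 by default, as in the analysis above)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n)]


if __name__ == "__main__":
    toy = {"virus_A": "ACGTAC", "virus_B": "ACGAAC", "virus_C": "TCGAAT"}  # toy data
    replicates = bootstrap_replicates(toy, n=100)
    print(len(replicates), replicates[0])
```

Each replicate would then be passed to a tree-building program, and the resulting trees summarised into a consensus tree.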

The sequences of EMCR-CoV were compared with those of reference viruses
representing each species in the four groups of coronaviruses. These were: human
coronavirus 229E (229E), af304460; porcine epidemic diarrhea virus (PEDV), af353511;
transmissible gastroenteritis virus (TGEV), aj271965; bovine coronavirus (BoCoV),
af220295; murine hepatitis virus (MHV), af201929; avian infectious bronchitis virus
(AIBV), m95169; canine coronavirus (CaCoV), d13096; feline coronavirus (FeCoV),
ay204704; porcine respiratory coronavirus (PRCoV), z24675; human coronavirus OC43
(OC43), m76373, l14643, m933990; porcine haemagglutinating encephalomyelitis virus
(HEV), ay078417; rat coronavirus (RtCoV), af207551. References for the viruses are the
numbers of the NCBI catalog (http://www.ncbi.nlm.nih.gov/entrez/).

In general, coronaviruses, such as EMCR-CoV, can be isolated and identified according
to the following protocol:

Specimen collection

In order to find virus isolates, nasopharyngeal aspirates, throat and nasal swabs,
bronchoalveolar lavages, serum and plasma samples, and stools, preferably from
mammals such as humans, carnivores (dogs, cats, mustelids, seals etc.), horses,
ruminants (cattle, sheep, goats etc.), pigs, rabbits, birds (poultry, ostriches, etc.), should
be examined. From birds cloaca swabs and droppings can be examined as well. Sera
should be collected for immunological assays, such as ELISA, molecular-based assays,
such as RT-PCR, and virus neutralisation assays.

Collected virus specimens may be diluted with 5 ml Dulbecco MEM medium
(BioWhittaker, Walkersville, MD) and thoroughly mixed on a vortex mixer for one
minute. The suspension is then centrifuged for ten minutes at 840 x g. The sediment is
spread on a multispot slide (Nutacon, Leimuiden, The Netherlands) for
immunofluorescence techniques, and the supernatant is used for virus isolation.

Virus isolation 

For virus isolation Vero-118 cells or tMK cells (RIVM, Bilthoven, The Netherlands) were
cultured in 24 well plates containing glass slides (Costar, Cambridge, UK), with the
medium described below supplemented with 10% fetal bovine serum (BioWhittaker,
Verviers, Belgium). Before inoculation the plates were washed with PBS and supplied
with Eagle's MEM with Hanks' salt (ICN, Costa Mesa, CA) supplemented with 0.52
gram/liter NaHCO3, 0.025 M Hepes (BioWhittaker), 2 mM L-glutamine (BioWhittaker), 200
units/liter penicillin, 200 µg/liter streptomycin (BioWhittaker), 1 gram/liter
lactalbumin (Sigma-Aldrich, Zwijndrecht, The Netherlands), 2.0 gram/liter D-glucose
(Merck, Amsterdam, The Netherlands), 10 gram/liter peptone (Oxoid, Haarlem, The
Netherlands) and 0.02% trypsin (Life Technologies, Bethesda, MD). The plates were
inoculated with supernatant of the patient samples, 0.2 ml per well in triplicate,
followed by centrifuging at 840 x g for one hour. After inoculation the plates were
incubated at 37 °C for 1-7 days and cultures were checked daily for CPE. Extensive CPE
was generally observed within 5-10 days and included detachment of cells from the
monolayer.

Virus culture 

Sub-confluent monolayers of tMK cells or Vero clone 118 cells in media as described 
above were inoculated with supernatants of samples that displayed CPE or with 
samples taken from a patient. 

RNA isolation 

RNA was isolated from the supernatant of infected cell cultures or sucrose gradient 
fractions using a High Pure RNA Isolation kit according to instructions from the 
manufacturer (Roche Diagnostics, Almere, The Netherlands). RNA can also be isolated 
following other procedures known in the field (Current Protocols in Molecular Biology). 

Sequence analysis 

Sequence analyses were performed by BaseClear holding BV (Leiden, The Netherlands).

1. An isolated essentially mammalian positive-sense single stranded RNA virus
(EMCR-CoV) comprising the sequence of figure 1 or homologues thereof.

2. An isolated positive-sense single stranded RNA virus (EMCR-CoV) belonging to
the Coronaviruses and identifiable as phylogenetically corresponding thereto by
determining a nucleic acid sequence of said virus and testing it in phylogenetic tree
analyses wherein maximum likelihood trees are generated using 100 bootstraps and 3
jumbles and finding it to be more closely phylogenetically corresponding to a virus
isolate having the sequences as depicted in figure 1 than it is corresponding to a virus
isolate of PEDV (porcine epidemic diarrhea virus), HCoV-229E (human coronavirus
229E), PRCoV (porcine respiratory coronavirus), TGEV (transmissible gastroenteritis
virus), CaCoV (Canine coronavirus) and FeCoV (feline coronavirus).

3. A virus according to claim 1 or 2 wherein said nucleic acid sequence comprises an
open reading frame (ORF) encoding a viral protein of said virus.

4. A virus according to claim 3 wherein said open reading frame is selected from the
group of ORFs encoding the viral replicase, nuclear capsid protein, matrix protein and
the spike protein.

5. A virus according to any one of claims 1 to 4 isolatable from a human with atypical pneumonia.

6. An isolated or recombinant nucleic acid or EMCR-CoV virus-specific functional
fragment thereof obtainable from a virus according to any one of claims 1 to 5.

7. A vector comprising a nucleic acid according to claim 6. 



8. A host cell comprising a nucleic acid according to claim 6 or a vector according to
claim 7. 



9. An isolated or recombinant proteinaceous molecule or EMCR-CoV virus-specific 
functional fragment thereof encoded by a nucleic acid according to claim 6.

10. An antigen comprising a proteinaceous molecule or EMCR-CoV virus-specific
functional fragment thereof according to claim 9.

11. An antibody specifically directed against an antigen according to claim 10. 

12. A method for identifying a viral isolate as an EMCR-CoV virus comprising 
reacting said viral isolate or a component thereof with an antibody according to claim 
11. 

13. A method for identifying a viral isolate as an EMCR-CoV virus comprising 
reacting said viral isolate or a component thereof with a nucleic acid according to claim 
6. 

14. A method for virologically diagnosing an EMCR-CoV infection of a mammal
comprising determining in a sample of said mammal the presence of a viral isolate or 
component thereof by reacting said sample with a nucleic acid according to claim 6 or an 
antibody according to claim 11. 

15. A method for serologically diagnosing an EMCR-CoV infection of a mammal 
comprising determining in a sample of said mammal the presence of an antibody 
specifically directed against an EMCR-CoV virus or component thereof by reacting said 
sample with a proteinaceous molecule or fragment thereof according to claim 9 or an 
antigen according to claim 10. 

16. A diagnostic kit for diagnosing an EMCR-CoV infection comprising a virus
according to any one of claims 1 to 5, a nucleic acid according to claim 6, a proteinaceous
molecule or fragment thereof according to claim 9, an antigen according to claim 10
and/or an antibody according to claim 11.

17. Use of a virus according to any one of claims 1 to 5, a nucleic acid according to claim
6, a vector according to claim 7, a host cell according to claim 8, a proteinaceous
molecule or fragment thereof according to claim 9, an antigen according to claim 10, or
an antibody according to claim 11 for the production of a pharmaceutical composition.

18. Use according to claim 17 for the production of a pharmaceutical composition for
the treatment or prevention of an EMCR-CoV virus infection.

19. Use according to claim 17 or 18 for the production of a pharmaceutical
composition for the treatment or prevention of atypical pneumonia.

20. A pharmaceutical composition comprising a virus according to any one of claims
1 to 5, a nucleic acid according to claim 6, a vector according to claim 7, a host cell
according to claim 8, a proteinaceous molecule or fragment thereof according to claim 9,
an antigen according to claim 10, or an antibody according to claim 11.

21. A method for the treatment or prevention of an EMCR-CoV virus infection
comprising providing an individual with a pharmaceutical composition according to
claim 20.

22. A method for the treatment or prevention of atypical pneumonia comprising 
providing an individual with a pharmaceutical composition according to claim 20. 

23. A viral replicase encoded by an RNA sequence comprising the indicated
sequences, or homologues thereof as depicted in figure 1.

24. A viral spike protein comprising the indicated amino acid sequence as depicted in
figure 1, or a homologue thereof.

25. A viral nuclear capsid protein encoded by an RNA sequence comprising the
indicated sequence as depicted in figure 1 or a homologue thereof.

26. An nsP3 or envelope protein encoded by an RNA sequence comprising the
indicated sequence as depicted in figure 1, or a homologue thereof.

27. A nucleic acid sequence which comprises one or more of the sequences coding for
separate viral proteins as depicted in figure 1 or a nucleic acid sequence which can
hybridise with any of these sequences under stringent conditions.

The invention relates to the field of virology. The invention provides a new
isolated essentially mammalian positive-sense single stranded RNA virus
(EMCR-CoV) within the group of coronaviruses and components thereof.



EMCR-CoV.MPD (1 > 27532) Site and Sequence

Enzymes : All 212 enzymes (No Filter) 

Settings : Circular. Certain Sites Only. Standard Genetic Code 

AGATAGAGAATTTTCTTATTTAGACTTTGTGTCTACTCCTCTCAACTAAACGAAATTTTTCTAGTGCTGTCATTTGTTATG GCAGT 

TCTATCTCTTAAAAGAATAAATCTGAAACACAGATGAGGAGAGTTGATTTGCTTTAAAAAGATCACGACAGTAAACAATACCGTCAGGAT 
5'UTR



GTAATTGAAATTTCGTCAAGTTTGTAAACTGGTTAGGCAAGTGTTGTATTTTCTGTGTTTAAGCACTGGTGGTTCTGTCCACTAG TGCAC, 
CATTAACTTTAAAGCAGTTCAAACATTTGACCAATCCGTTCACAACATAAAAGACACAAATTCGTGACCACCAAGACAGGTGATCACGTG- 



5'UTR



ATTGATACTTAAGTGGTGTTCT GTCACTGCTTATTGTGGAAGCAACGTTCTGTCGTTGTGGAAACCAATAACTGCTAACCATGTTTTACA/ 

TAACTATGAATTCACCACAAGACAGTGACGAATAACACCTTCGTTGCAAGACAGCAACACCTTTGGTTATTGACGATTGGTACAAAATGT1 



5'UTR | M F Y
Replicase 1a

CAAGTGACACTTGCTGTTGCAAGTGATTCGGAAATTTCAGGTTTTGGTTTTGCCATTCCTTCTGTAGCCGTTCGCGCTTATAGCGAAGCCG

GTTCACTGTGAACGACAACGTTCACTAAGCCTTTAAAGTCCAAAACCAAAACGGTAAGGAAGACATCGGCAAGCGCGAATATCGCTTCGGC

QVTLAVASDSEISGFGFAIPSVAVRAYSEA
Replicase 1a



TGCACAAGGTTTTCAGGCATGCCGCTTTGTTGCTTTTGGCTTACAGGATTGTGTAACCGGTATTAATGATGACGATTATGTCATTGCATTG

ACGTGTTCCAAAAGTCCGTACGGCGAAACAACGAAAACCGAATGTCCTAACACATTGGCCATAATTACTACTGCTAATACAGTAACGTAAC

AQGFQACRFVAFGLQDCVTGINDDDYVIAL
Replicase 1a



CTGG T ACTAATCAGCTTTGT GCCAAAATTTTACTTTTTTCTG ATAGACCTCTTAATTTGCGAGGTTGGCTCATTTTTT 

GACCATGATTAGTCGAAACACGGTTTTAAAATGAAAAAAGACTATCTGGAGAATTAAACGCTCCAACCGAGTAAAAAAGATTGTCGTTAAT 

T G T N Q L CAKILLFSDRPLNLRGWLIFSNSNY 
Replicase 1a

GTTCTTCAGGACTTTGATGT TGTTTTTGGCCATGGTGCAGGAAGTGTGGTTTTTGTGGATAAGTATATGTGTGGTTTTGATGGTAAACrrTR 

CAAGAAGTCCTGAAACTACAACAAAAACCGGTACCACGTCCTTCACACCAAAAACACCTATTCATATACACACCAAAACTACCATTTGGAC, 

V L Q D F D V VFGHGAGSVVFVDKYMCGFDGKP 

Replicase 1a



GTTACCTAAAAACATGTGGGAATTTAGAGATTACTTTAATGATAATACTGATAGTATTGTTATTGGTGGTGTCACTTATCAATTAGCATGGG

CAATGGATTTTTGTACACCCTTAAATCTCTAATGAAATTACTATTATGACTATCATAACAATAACCACCACAGTGAATAGTTAATCGTACCC

LPKNMWEFRDYFNDNTDSIVIGGVTYQLAW
Replicase 1a

ATGTTATACGTAAAGACCTTTCTTATGAACAGCAAAATGTTTTAGCTATTGAGAGCATTCATTATCTTGGCACTACAGGTCATACTTTGAAG

TACAATATGCATTTCTGGAAAGAATACTTGTCGTTTTACAAAATCGATAACTCTCGTAAGTAATAGAACCGTGATGTCCAGTATGAAACTTC

DVIRKDLSYEQQNVLAIESIHYLGTTGHTLK
Replicase 1a



D1 December 2003 1 1 :52 — / w T 
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TCTGGTTGCAA ACTCATTAATQCCAAGCCGCCTAAATATTCTTCTAAGGTTGTTTTGAGTGGTGAATGGAATGCTGTGTATAAGGCGTTTGG ^ 
AGACCAACGTTTGAGTAATTACGGTTCGGCGGATTTATAAGAAGATTCCAACAAAACTCACCACTTACCTTACGACACATATTCCGCAAACC 

SGCKLINAKPPKYSSKVVLSGEWNAVYKAFG

TTCACCATTTATTACAAATGGTATATCATTGCTAGATATAATTGTTAAACCAGTTTTCTTTAATGCTTTTGTTAAATGCAATTGTGGTTCTG
AAGTGGTAAATAATGTTTACCATATAGTAACGATCTATATTAACAATTTGGTCAAAAGAAATTACGAAAACAATTTACGTTAACACCAAGAC 

SPFITNGISLLDIIVKPVFFNAFVKCNCGS
Replicase 1a

AGAATTGGAGTGTTGGTGCATGGGATGGTTATCTATCTTCTTGTTGTGGCACACCTGCTAAGAAACTTTGTGTTGTTCCTGGTAATGTTGTT
TCTTAACCTCACAACCACGTACCCTACCAATAGATAGAAGAACAACACCGTGTGGACGATTCTTTGAAACACAACAAGGACCATTACAACAA

ENWSVGAWDGYLSSCCGTPAKKLCVVPGNVV



CCTGGTGATGTGATCATCACCTCAACTGATGCTGGTTGTGGTGTTAAATACTATGCTGGCTTAGTTGTTAAACATATTACTAACATTACTGG

GGACCACTACACTAGTAGTGGAGTTGACTACGACCAACACCACAATTTATGATACGACCGAATCAACAATTTGTATAATGATTGTAATGACC

PGDVIITSTDAGCGVKYYAGLVVKHITNITG
Replicase 1a



TGTGTCTTTATGGCGTGTTACAGCTGTTCATTCTGATGGAATGTTTGTGGCAACATCTTCTTATGATGCACTTTTGCATAGAAATTCATTAG 




ACACAGAAATACCGCACAATGTCGACAAGTAAGACTAC 



1 " ' 1,11111 ' ! '— ■ -CJTACAAACACCGTTGTAGAAGAATACTACGT.GAAAACGTATCTTTAAGTAATC 



VSLWRVTAVHSOG » FV ^A TS9YDALLHRNSL 



-Replicase 

ACCCT TTTTGCTTTGATGTTAACACTTTACTTTCTAATCAATTACGTCTAGCTTTTCTTGGTGCTTCTGTTACAGAAGATGTTAAATTTGCT 
TGGGAAAAACGAAACTACAATTGTGAAATGAAAGATTAGTTAATGCAGATCGAAAAGAACCACGAAGACAATGTCTTCTACAATTTAAACGA 

0 P F C F D V N T L L S N Q L -R L AFLGAS VTEDVKFA 
u r r Replicase 1a 



^TARr.ACTGGTGTTATTGACATTAGTGCTGGTATGTTTGGTCTTTACG ATGACATATTGACAAACAATAAACCTTGGTT 




TTGGTTTGTACGCAAAGC 



CGATCGTGACCACAATAACTGTAATCACGACCATACAAACCAGAAATGCTACTGTATAACTGTTTGTTATTTGGAACCAAACATGCGTTTCG 
5TGVI D ISAGMF GLYDD I LTNNKPWF V R K J 



A J ' " ' ! - — : — - — - - — — Replicase 1a 



TTCTGGGCTTTTTGATGCAATCTGGGATGCTTTTGTTGCC^ 

aagacccgaaaaactacgttagaccctacgaaaacaacggcgataatIcgaacacggttgatgatgaccaccaaaccaatccaaacaattca 

q|_FDA IWDAFVAA1KLVPTTTGGLV R I F V K 



s " *- ' - — - — I — : — I — - — : Replicase 1a 



CTATCGCTTCAACTGTTTTAACT6TTTCTAATGGTGTTATTA 

GAiAGtGAAGnGAciAAATlGACAAAGATlACCACAATAATAATACACACGTCTACAAGGTCTACGAAAAGTTGGTCAAATGGCGTGTAA/ 

SIASTVLTVSNGV1 I M C ADVPDAFQPVYRTF 
J — Replicase 1a " 
EMCR-CoV.MPD (1 > 27532) Site and Sequence



ACACAAGCTATTTGTGCTGCATTTGATTTT^ 

TGTGTTCGATAAACACGACGTAAACTAAAAAGAAATCTACATAAATTTTAACCACTACAATTTAAATTTGCTGAACCACTAATACAAGAAT 

T 0 A 1 CAAFDFSLDVFK IGDVKFKRLGDYVL 
Replicase 1a- — 



TGAAAATGCTCTTGTTCGTTTGACTAC TGAAGTTGTTC6TGGTGTTCGTGATGCTCGCATAAAGAAAGCCATGTTTACTAAAGTAGTTGTA 

ACTTTTACGAGAACAAGCAAACTGATGACTTCAACAAGCACCACAAGCACTACGAGCGTATTTCTTTCGGTACAAATGATTTCATCAACAT 

E N A L V R L T TEVVRGVRDAR I KKAMFTKVVV 
" " — Replicase 1 a — — L 

gtcctacaac tgaagttaagtttVctgttattgaacttgccactgttaatttgcgtcttgttgattgtgcacctgtagtttgccctaaagg 

caggatgttgacttcaattcaaaagacaataacttgaacggtgacaattaaacgcagaacaactaacacgtggacatcaaacgggatttcc, 

6 P J T E VKFSVIELATVNLRLV DCAPVVCPKG 
~" — ■ — Replicase 1a — — 

AAAATTGTTGTTATTGCTGGACAAGCTTTTTTCTATAGTGGTGGTTTTTATCGTTTTATGGTTGATTCTACAACTGTATTAAATGACCCTG* 

TTTTAACAACAATAACGACCTGTTCGAAAAAAGATATCACCACCAAAAATAGCAAAATACCAACTAAGATGTTGACATAATTTACTGGGAC) 

K I VVIAGQAFFYSGGFYRFMVDSTTVLNDP 

■ Replicase 1a — _ 



TTTTACTGGTGAGTTATTTTATACTATTAAGTTTAGTGGTTTTAAGCTTGATGGTTTTAACCATCAGTTTGTTAATGCTAGTTCTGCTACAC 

AAAATGACCACTCAATAAAATATGATAATTCAAATCACCAAAATTCGAACTACCAAAATTGGTAGTCAAACAATTACGATCAAGACGATGTC 

g T G E LFYT I KFSGFKLOGFN. HQFVNAS S A T 
— Replicase 1a 



ATGCCATTATTGCTGTTGAGCTGTTGTTATCGGATTTTAAAACTGCAGTTTTTGTGTACACATGTGTGGTTGATGGTTGTAGTGTCATTGT1 

TACGGTAATAACGACAACTCGACAACAATAGCCTAAAATTTTGACGTCAAAAACACATGTGTACACACCAACTACCAACATCACAGTAACA/s 

D A I I A V E L L L S D F X T A V F V Y T C V V D G C S V I V 
Replicase 1a^ 1 

AGACGTGATGCTACATTCGCCACACATGTGTGTTTTAAGGACTGTTATAGTATTTGGGAGCAATTCTGCATTGATAATTGTGGTGAGCCATG 

TCTGCACTACGATGTAAGCGGTGTGTACACACAAAATTCCTGACAATATCATAAACCCTCGTTAAGACGTAACTATTAACACCACTCGGTAC 

RR DATFATHVCFKDCYSIWEQFC IDNCGEPV 
— Replicase 1 a — 1 




CAAAAACTGACTAATATTACGATAGAACGTCTCATTATTGGGAGTTACACGATAACAAGTTCGTAGCCTCAGATTTCAAAACGAACTCTCCA 

F L T D Y ' N A I LQSNNPQCAIVQASESKVL'LER 

— Replicase 1 a — 



TTTTACCTAAGTGTCCTGAAATACTGTTGAGTATTGATGATGGCCATTTATGGAATCTTTTTGTTGAAAAGTTTAATTTTGTTACAGATTGG 

AAAATGGATTCACAGGACTTTATGACAACTCATAACTACTACCGGTAAATACCTTAGAAAAACAACTTTTCAAATTAAAACAATGTCTAACC 

FLPKCPEI L LSIDDGHLWNLFVEKFNFVTDW 
——————— — Replicase 1a ■ — — ■ _ 
TTAAAAACTCTTAAGCTTACACTTACTTCTAATGGTCTTTTAGGTAATTGTGCCAAACGTTTTAGACGTGTTTTGGTAAAATTGCTTGATGT
AATTTTTGAGAATTCGAATGTGAATGAAGATTACCAGAAAATCCATTAACACGGTTTGCAAAATCTGCACAAAACCATTTTAACGAACTACA 

LKTLKLTLTSNGLLGNCAKRFRRVLVKLLDV
Replicase 1a



CTATA ATGGTTTTCTTGAAACTGTCTGTAGTGTCGTACACACTGCTGGTGTTTGCATTAAATATTATGCTGTTAATGTTCCATATGTAGTTA 
GATATTACCAAAAGAACTTTGACAGACATCACAGCATGTGTGACGACCACAAACGTAATTTATAATACGACAATTACAAGGTATACATCAAT 



Y N G F L E T V C S V V H T A..0 



TT AGTGGTTTTGTAAGTCGTGTAATTCGTAGAGAAAGGTGTGACGTGACTTTTCCTTGTGTTAGTTGTGTCACTTTTTTCTATGAATTTTTA 

aatcaccaaaacattcagcacIttaagcatcIctttccacactgcactgaaaaggaacacaatcaacacagtgaaaaaagatacttaaaaat 

ISGFVSR^V I R R E R C D^T J PCVSCVTFFYEFL 



gacacgtgttttggtgttagtaaacctaatgccattgatgttgaacatttagagcttaaagaaactgtttttgttgaacctaaggatggtgg 

iiGTGCACAAAACCACAATiATTTGGATTACGGTAACTAiAACTTGTAAATCTCGAATTicTTTGACAAAAACAACTTGGATTCCTACCACC 
D T C F G V S K P N A I D V.E..H. E L K E T V F V E P K 0 G G 

TCAAT TTTTTGTTTCTGATGATTATCTTTGGTATGTTGTAGATGACATTTATTATCCAGCTTCATGTAATGGTGTATTGCCAG^ 
AGllAAAAAACAAAGAclACTAATAGAAACCAlACAAiATCTACTGTAAATAATAGGTCGAAGTACATTACCACATAACGGTCAACGAAAAT 

OFFV SDD Y LWYVVD D. I Y Y P A S C N G V L P V A F 
u r Replicase 1a — 



CAAA ATTGGCAGGTGGTAAAATATCTTTTTCTGATGATGTTATAGTTCATGATGTTGAACCTACCCATAAAGTCAAGCTCATATTTGAGTTT 
GTTTTAACCGTCCACCATTTTATAGAAAAAGACTACTACAATATCAAGTACTACAACTTGGATGGGTATTTCAGTTCGAGTATAAACTCAAA 

T 



KLAGGK 1 SFSD D V [VHDVEPTHKVKLIFE F 



Replicase 1a 

GTTGTTACCAGTCTTTGTAAGAAGAGTTTTGGTAAGTCTATTATTTATACAGGTGATTGGGAAGGTTTACATGAAGTTCTTAC 



GAAGA TGATGTTGTTACCAGTCTT I U I aauaauHq i i i \^<a i hmu i ^ iqj ■ ^ , ; j ~ J " , , , , , , , , | , , , , { 

CTTCTACTACAACAATGGTCAGAAACATTCTTCTCAAAACCATTCAGATAATAAATATGTCCACTAACCCTTCCAAATGTACTTCAAGAATG 

EDDVVTSLCKKSFGKS l IYTGDWEGLHEVLT 
^ . Replicase 1a — 



ATC TGCAATGAATGTCATTGGGCAACATATTAAGTTGCCACAATTTTATATTTATGATGAAGAGGGTGGTTATGATGTTTCTAAACCAGTTA 

I » ■ 1 11 11 1 ' H 1 ' H ' ' ' ' ^ 1 1 ' 111 ' 1 ' ' ' 1 '"'ill ' ' ''11 1 ^-lilllr.TT.lTr.nr Irr a at ArTAr A A Afl ATTTCCTT. A AT 



TAGACGTTACTTACAGTAACCCGTTGTATAATTCAACGGTGTTAAAATATAAATACTACTTCTCCCACCAATACTACAAAGATTTGGTCAAT 

SAMNV I GQH I K LP Q F Y I YOEEGGYDVSK PV 
_ — Replicase 1a ■ 



TGATTTCACAATGGCCTATTAGTGATGATAGTGATGGTT^ 



ACTAAAGTGTTACCGGATAATCACTACTATCACTACCAACACAACAACTTCGCTCGTGACTAAAAGTAGTTAATCTTAGACAATCTCTTCTC 

MI SQV/P I SDOSDGCVVE ASTDFHOLESVREE 
" Replicase 1 a 



^"I""^"^^"*"^^"^^"^ - T G AAC A AC C T T TT<3<3 T"I~G CA~l"GCGC"irCTCAA*FTA G AC A/\.CCTTTrTTCTT~TT~rC"rTTT T ACS ~TCSA A "T T <3 15 C3 ~T G 

CAACTATATTAACTTGTTGGAAAACCCCTTCAACTTGTACGCGAGAGTTAATCTGTTGGAAAAAGAAAAAGAAAATCTCTACTTAACCCAC. 

v 0 I IEQPFGEVEHALS IRQPFSFSFRDELG 
— Replicase 1a _____ 

TCGTGTTTTA GATCAATCTGATAATAATTGTTGGATTAGTACCACACTTATACAGTTGCAACTTACAAAGCTTTTGGATGATTCTATTRARi 

AGCACAAAATCTAGTTAGACTATTATTAACAACCTAATCATGGTGTGAATATGTCAACGTTGAATGTTTCGAAAACCTACTAAGATAACTC" 

RVLDQSDNNCW ISTTLIQLQLTKLLDDSIE 
— — ■ Replicase 1a' — 

TGCAATTGTTTAAA GTTGGTAAAGTTGATTCAATTGTTCAAAAGTGTTATGAGTTGTCTCATTTAATTAGTGGTTCACTTGGTGATAGTGG1 


ACGTTAACAAATTTCAACCATTTCAACTAAGTTAACAAGTTTTCACAATACTCAACAGAGTAAATTAATCACCAAGTGAACCACTATCACC/ 

M Q L F K VGKVDS I VQKCYELSHL I SGSLGDSG 
— — —Replicase 1a ■ . — 

AAACTTCTTAGTGAACTTCTTAAAGATAAATATACATGTTCTATAACTTTTGAGATGTCTTGTGATTGTGGTAAAAAGTTTGATGAGCAAG1 

TTTGAAGAATCACTTGAAGAATTTCTATTTATATGTACAAGATATTGAAAACTCTACAGAACACTAACACCATTTTTCAAACTACTCGTTC^ 

KLLSELLKDKYTCSITFEMSCDCGKKFDEQi 
•■ -Replicase 1a— — 

TGGTTGTTTGTTTTGGATTATGCCTTACACAAAACTTTTTCAAAAAGGTGAGTGTTGTATTTGTCATAAAATGCAGACTTATAAGCTTGTTA 
ACCAACAAACAAAACCTAATACGGAATGTGTTTTGAAAAAGTTTTTCCACTCACAACATAAACAGTATTTTACGTCTGAATATTCGAACAAT 

GOLF W I M PYT KLFQKGECC I-CHKMQTYK'LV 
1 Replicase 1 a ■ : 



GTATGAAAGGTACTGGTGTGTTTGTACAGGATCCAGCACCTATTGACATTGATGCTTTCCCTGTTAGACCTATATGTTCATCTGTATATTTA 


CATACTTTCCATGACCACACAAACATGTCCTAGGTCGTGGATAACTGTAACTACGAAAGGGACAATCTGGATATACAAGTAGACATATAAAT 

SMKGTGVFVQDPAP IDIDAFPVRPICSSVYL 
-Replicase 1a 



GGTGTTAAGGGTTC TGGTCATTATCAAACAAATTTATACAGTTTTGACAAAGCTATTGATGGTTTTGGTGTCTTTGACATTAAAAATAGTAft 


CCACAATTCCCAAGACCAGTAATAGTTTGTTTAAATATGTCAAAACTGTTTCGATAACTACCAAAACCACAGAAACTGTAATTTTTATCATC 

G v K G S GHYQTNLYSFDKA I DGFGVFD I KNSS 
■ Replicase 1 a ■ . 






ACAATTATGACAAACAAAACAACTACAACTAAAAGTATCACATCTTTATCTTCGACCACTTCAATTTGGAAAACGACATATATTTTTACAAT 

VNTVCFVDVDFHSVEIEAGEVKPFAVYKNV 
Replicase 1a^ 



AATTTTATTTA ? GTGATATTT CACACCTTGTM ^ 


TTAAAATAAATCCACTATAAAGTGTGGAACATTTGACACAAAGAAAACTGAAACAACAGTTACGACGATTACTTTTAGAGTACGTACCTCCG 

KFYLGDI SH LVNCVSFDFVVNAANENLMHGG 
"" " — — Replicase 1a' . 




G GTGTCGCACGTGCTATTQATATTTTGACTGAAGQTCAACTTCAGTCATTATCTAAAGATTACATTAGTAGTAATGGTCCACT ^ 

ccacagcgtgcacgataacIataaaactgacttccagttgaagtcagtaatagatttctaatgtaatcatcattaccaggtgaattccaacc 

GVARA10ILTEGQ L r 0, ^ L, S K0YISSNGPLKV6 j 



ag caggtgttatgttggagtgtgaaaaattcaatgtatttaatgttgttggtccgcgaactggtaaacatgagcattcattacttgttgaag ^ 

TCGTCCACAATACAACCTCACACTTTTTAAGTTACATAAATTACAACAACCAGGCGCTTGACCATTTGTACTCGTAAGTAATGAACAACTTC 

AGVMLECEKFNVFNVV 6 PRTGKHEHSLLVE 
w , Replicase 1a — - — — 



CTTATAATTCTATTTTATTTGAAAATGGTATTCCACTTATGCCTCTTCTTAGTTGTGGTATTTT.TGGTGTAAGGATTGAAAATTCTCTTAAA ^ 



gaataItaagataaaataaacttttaccataaggtgaatacggagaagaatcaacaccataaaaaccacattcctaacttttaagagaattt 

A Y N S ! L F E » 0 > P L H r P r „ L ,L^ 5 C G I F 0 V R I E N S L K 



gctttgtttagttgtgacattaataaaccattgcaagtttttgtttattcttcaaatgaagaacaagctgttcttaagtttttagatggttt 
cgaaacaaatcaacactgtaattatttggtaacgttcaaaaacaaataagaagtttacttcttgttcgacaagaattcaaaaatctaccaaa 

ALFSCD I NKPLQVF V YS SNEEOAVLKFLDGL 
H L Replicase 1a — 

agatttaacaccagtcattgacgatgttgatgttgttaaa^ l 
tctaaattgtggtcagtaactgctacaactacaacaatttggaaaatctcaacttccattaaaaagtaagaaactaacac 

DL TPVIDDVDVVKPFRVE GNFSFFDC GV NA 
p L ' v vi» Replicase 1a 



TGGAT GGTGATATTTACTTATTATTTACTAACTCTATTTTAATGyGGATAAACAAGGACAATTATTGGACACAAAACTTAATGGTATTTTG ^ 
ACCTACCACTATAAATGAATAATAAATGATTGAGATAAAATTACAACCTATTTGTTCCTGTTAATAACCTGTGTTTTGAATTACCATAAAAC 

LDGD I YLLFTNS I L M L 0 KOGQLLDTKLNG I L 
L — Replicase 1a 



Replicase 

CAACA GGCAGTTCTTGATTATCTTGCTACAGTTAAAACTGTACCAGCTGGTAATTTGGTTAAACTTGTTGTTGAGAGTTGTACCATTTATAT 
GTTGTCCGTCAAGAACTAATAGAACGATGTCAATTTTGACATGGTCGACCATTAAACCAATTTGAACAACAACTCTCAACATGGTAAATATA 

QQAVLDYLATVKTVPA G NLVKLVVESCT I YM 
— ■ Replicase 1 a " 



GTGTGTTGTACCATCGATAAATGATCTTTCTTTTGATAAAAATCTTGGTCGTTGTGTGCGTAAACTTAATAGATTGAAAACTTGT 



GTTATTG 



CACACAACATGGTAGCTATTTACTAGAAAGAAAACTATTTTTAGAACCAGCAACACACGCATTTGAATTATCTAACTTTTGAACACAATAAC 

CVVPS I NDLSFDKNLGRCV RKLNRLKTCV1 
. ■ Replicase 1a ■ 



r.HAATRTTCCTGCTATTGATGTTTTGAAAAAGCTTCTTTCAAGTTTGACTTT AACTGTTAAATTTGTTGTAGAGAGTAATGTTATGGATGTT 



GGTTACAAGGACGATAACTACAAAACTTTTTCGAAGAAAGTTCAAACTGAAATTGACAATTTAAACAACATCTCTCATTACAATACCTACAA 

A N V P A I D V L K K L L S S L T. L TVKFVVESNVMDV 
Replicase 1 a 



AACGACTGTTTTAAGAATGATAATGTAGTTTTGAAAATTACTGAAGATGGTATTAATGTTAAAGATGTTGTTGTTQAGTCTTCTAAGT CAC 
TTGCTGACAAAATTCTTACTATTACATCAAAACTTTTAATGACTTCTACCATAATTACAATTTCTACAACAACAACTCAGAAGATTCAGTG 

' ° C F K N ° N V V L K 1 - Replicase 

T6GTAAACAATTGGGTGTTGTGAGTGATGGTGTTGACTCTTTTQAAGGTGTTTTACCTATTAATACTGATACTGTCTTATCTGTAGCT CCA 
ACCATTTGTiAACCCACAACACTCACTACCACAAiTGAGAAAAclTCCA<!AAAAiGGATAATTAiGACTAiGACAGAATAGACAicGAGGT' 

-Replicase «- " r ' " ' ° T V L S V A * 



G K Q LGVVSOGVDSFEGVL PIN TDTVLSVA 



AAGTTGACTGGGTTGCTTTTTACGGTTTTGAAAAGGCAGCACTTTTTGCTTCTTTGGATGTAAAGCCATATGGTTACCCTAATGAT TTTGT 
TTCAACTGACCCAACGAAAAATGCCAAAACTTTTCCGTCGTGAAAAACGAAGAAACCTACATTTCGGTATACCAATGGGATTA ' " " 



CTAAAACA, 

- • " - _ v k r r u y P N D F V 
Replicase 1a 



E_V_D W VAFYGFEKAALFASLDVKPYGYPNDF 



GGTGGTT 



TTAGAGTTCTTGGGACCACCGACAATAATTGTTGGGTTAA TGCAACTTGTATAATTTTACAGTATCTTAAGCCTACTTTTAAATf 



CCACCAAAATCTCAAGAACCCTGGTGGCTGTTATTAACAACCCAATTACGTTGAACATATTAAAATGTCATAGAATTCGGATGAAAATTTAC 

GGFRVLGTTD NNCWVNATC I ILQYLKPTFk'' 
■ Replicase 1a utlkptfk. 

TAAGGGTTTAAATGTTCTTTGGAACAAATTTGTTACAGGTGATGTTGGACCTTTTGTTAGTTTTATTTATTTTATAACTATGTCTTC 

ATTCCCAAATTTA^AAGAAACCTTGTTTAAACAATGTCCACTACAACciGGAAAACAATCAAAATAAATAAAATATTGATACAGAAGTTTCC 

KGLNVLWNKFVTG DVGPFVSF I YF [ TMS'SK 
— Replicase 1a ■ . - n 5 s K 



GTCAAAAGGGTGATGCTGAAGAGGCATTATCTAAATTGTCAGAGTATTTGATTAGTGATTCTATTGTTACTCTTGAACAATATTCAA CTTGT 

cagttttcccactacgacttctccgtaatagatttaacagtctcataaactaatcactaagItaacaatgagaactIgttaIaagtIgaaca' 



GQKGDAEEALS KLSEYL ISDS IVTLEQYSTr 
Replicase 1a — - — - ! Q Y S T c 

GACATTTGTAAAAGTACTGTAGTTGAAGTTAAAAGTGCTGTTGTCTGTGCTAGTGTGCTTAAAGATGGTTGTGATGTTGGTTTT TGTCCACA 
CTGTAAACATTTTCATGACATCAACTTCAATTTTCACGAiAACAGACACGATCACACGAATTTciACCAACACTACAACCAAAAACAGGTGi 

D ' C K S T V V E V K S A V V C A S V L K D G C D V G F C P H 

Replicase 1a — — ° K H 



rV-r"; T"v -r - ; ' J .■ ■ m.u».ftu, mjj t GTTAT-TACCAATGT-TGGTGAACeTAT-AATT-T CACAACCTTCTAAGT- 

GTCTGTATTTAACGCAAGTGCACAATTCAAACAATTACCTGCACAACAATAATGGTTACAACCACTTGGATATTAAAGTGTTGGAAGATTCA 



RHKLRSRVKFVN GRVVITNVGEPIISQPSK 
" ■ Replicase 1a a u ^ s K 



TGCTTAATGGTATTGCTTATACAACATTTTCAGGTTCTTTTGATAACGGTCACTATGTAGTTTATGATGCTGCTAATAATGCTG 
ACGAATTACCATAACGAATATGTTGTAAAAGTCCAAGAAAACTATlGCCAGTGATACATCAAATAifACGACGATiATTACGACAGATACTA 






RfiTRr.TrGTTTATTTGCTTCAGATTTGTCTACTT TAGCTGTTACAGCTATTGTTGTAGTAGGTGGTTQTGTAACATCTAATGTTCCACCAAT 
i i i l i i i ' l 1 1 1 1 I I I i i i i 1 i . 1 » » ' * ' ' f . >Tf .'_ f . ' . ..:.,. T tAtaoattap aapptpptta 



CCACGAGCAAATAAACGAAGTCTAAACAGATGAAATCGACAATGTCGATAACAACATCATCCACCAACACATTGTAGATTACAAGGTGGTTA 

G A R L F A S D L 3 T L A V „ T A I VVVGGC VTSNVPP 
b fl K L r -Replicase 1a 



TGGTGCACAAAAATTTTTCCAATTTGGTGATTTTGTTATGAATAACATTGTTC 




TGTTAGTGAGAA AATTTCTGTTATGGATAAACTTGATAC ^ 
ACAATCACTCTTTTAAAGACAATACCTATTTGAACTATGACCACGTGTTTTTAAAAAGGTTAAACCACTAAAACAATACTTATTGTAACAAG 



KFFQFGDFVMNN 1 V 




ACAAAAATTGAACCAACGAATCATAC.AAATCAGAAAATGCATG 

_ . - . i e M IT «! I I R 

•RepHcase 1a 



LFLTWLLS M F S L L R „T,„S, I M K H D I K V I A K A P K R 



ACAGGTGTTAT TTTGACACGTAGTTTTAAGTATAACATTAGATCTGCTTTGTTTGTTGTAAAGCAGAASTGGTGTGTTATTGTT ACTTTGTT 
' ' ' 1 ' ' ' ' ' ' ' ' ' - - 1 > 1 *^acGAAACAAACAACATTTCGTCTTCACCACACAATAACAATGAAACAA 



TGTCCACAATAAAACTGTGCATCAAAATTCATATTGTAATCTAG 
T G V I L T R 



SFKYNIRSALFV VKQKWCV1VTLF 
Replicase 1a— — ! 



TAAGTTCTTATT GTTATTATATGCTATTTATGCACTTGTTTTTATGATTGTGCAATTTAGTCCTTTTAATAG TCTTTTATGTGGTGACATTG ^ 
luCAAGAATMCAAlAATAlACGAlAAATACGTGAACAAAAATAiTAACACGTTAAATCAGGAAAATTATCAGAAAATACACCACTGTAAC 

K F L L L L Y A I Y A L V F M I V Q F S P F N 3 L L C G D I 
K h L L Replicase 1a- 



TAAGTGGTTA TGAAAAATCCACTTTTAATAAGGATATTTATTGTGGTAATTCTATGGTTTGTAAGATGTGTTTGTTTAGTTATCAAGA GTTT ^ 
ATTCACCAATACTTTTTAGGTGAAAATTATTCCTATAAATAACACCATTAAGATACCAAACATTCTACACAAACAAATCAATAGTTCTCAAA 



V SGY EKST FNKD1Y ^.J^S 



MVCK MCLFSYQEF 



AATGATTTGGATCATACTAGTCTTGTTTGGAAGCACATTCGTGATCCTATATTAATCAGTTTACAACCATTTG 



TTATACTTGTTATTTTGTT 






tIactaaacctagtatgatcagaacaaaccttcgtgtaagcactaggatataattagtcaaatgttggtaaacaatatgaacaataaaacaa 

N D L D H T S L V W K H I R P„ P I L I S L Q P F V I L V 1 L _ _L 
N u L u n . — Replicase 1 a 




TTAAAAACCATTATACATAAACGCAAAACCTGAAAATATAAAACAACGT 
I F G N M Y 



L R F G L L Y F V A Q F I STFGSFLG FHQ 
— — Replicase 1a — 



&Ar.ARTGGTTTTTACATTTTGTGCCGTTTGATGTTTTATGTAATGAGTTTTTAGCTACATTTATTGTCTGCAAAATTGTTTTATTTGTTAGA 






TTGTCACCAAAAATGTAAAACACGGCAAACTACAAAATAC 
KQWFLHF VPFDVLC ^E__ F„L_ 




ATTACTCAAAAATCGATGTAAATAACAGACGTTTTAACAAAATAAACAATCT 



ATF1VCK1VLFVR 




CATATTATTGTTGGCTGTAATAATGCTGACTGTGTAGCTTGTTCTAAAAGTGCTAGACTTAAACGTGTACCA 

gtataataacaaccgacattattacgactgacacatcgaacaagattttcacgat_tgaaItt.gcacatggtgaagtttgataataatta( 

H ' 'VGCNNADCVACSKS ARLKRVPLQT I IN 



■Replicase 1a- 



TATGCATAAATCATTCTATGTTAATGCTAATGGTGGTACTTGTTTCTGTAATAAACATAACTTCTTTTGTGTTAATTGTG ATTCTTTTGGC 

atacgtatttagtAagatacaattacgattaccaccatgaacaaagacattatItgtattgaagaaaacacaaItaacactaagaaaaccc 

MHKSFYVNANGGT ^NKHNFFC VNCDSFG 



CTGGTAATACTTTTATTAATGGTGATATTGCAAGAGAGCTTGGTAATGTTGTTAAAACAGCTGTTCAACCCACAGCTCCTGCATATGT TAT 

gaccattatgaaaataattaccactataacgttctctcgaaccattacaacaattttgtcgacaagItgggIgtcgaggacgtatacaata 

PGNTF I NGD I AREL G NVVKTAVQPTAPAYVI 
Replicase 1a ■ « r a t v i 

ATTGATAAGGTAGATTTTGTTAATGGATTTTATCGTCTTTATAGTGGTGACACTTTTTGGCGGTATGACTTTGACATTACTGAATC TAAGT. 
TAACTATTCCATCTAAAACAATTACCTAAAATAGCAGAAATATCACCACTGTGAAAAAC.GCCATACTGAAACTGTAATGACTTAGATTCA 



lDKVDFVNGFYR L YSGDTFWRYDFD I T E S K 
~ Replicase 1a — 



TAGTTGTAAAGAGGTTCTGAAGAATTGTAATGTTTTAGAAAATTTTATTGTTTACAATAATAGTGGTAGTAACATTACACAGATTAAA AAT( 

ATCAACATTTCTCCAAGACTTCTTAACATTACAAAATCTTTTAAAATAACAAATGTTATTATCACCATCATTGTAATGTGTclAATTiTTAC 

S CKEVLK NCNV L.E N F I V Y N N SGSN1TQ I KN 
— — Replicase 1a — — ___________ 

CTTGTGTTTATTTTTCTCAATTGTTGTGTGAACCTATAAAGTTGGTAAATTCAGAGTTGTTGTCAACTTTATCAGTTGATT TTAATGGTGTT 

gaacacaaataaaaagagttaacaacacacttggatatttcaaccatttaagtctcaacaacagttgaaaIagtcaactaaaattaccacaa 

A C V Y F 5 Q L L C E P I K L .V N SELL STLSVDFNGV 
—Replicase 1a- -—-_-_______ 

TTGCATAAGGCATATGTTGATGTTTTGTGTAATAGTTTTTTTAAGGAGCTAACTGCTAACATGTCCATGGCTGAATGTAAAGCT ACACT^ 
AACGTATTCCGTAfACAACTACAAAACACATTATCAAAAAAATlcCTCGATTGACGATlGTACAGGTACCGAclTACAiTTCGATGiGAACC 

L " K A Y V ° V L C N 3 F V plicfse I T * " " * " * * C K A T L E 



■ rr . -n-r - "I - ,T | - , - , y ",; j ^"^^'^^^ M|ib,MHbAbATTT6TC-ATTT-AA-TAATTTTT 

aaactgacaaagactactactaaaacaaagtcgacaacggttacgtgtatccatactgcaaaacgaaagtctaaacagtaaattatIaaaaa 

LTVSDDDFVSAVA NAHRYDVLLSDLSFNNF 

~~ -Replicase 1a- 



TTATTTCTTATGCTAAACCTGAAGATAAGTTGTCCGTTTATGACATTGCTTGTTGTATGCGTGCCGGTTCTAAGGTTGTTAACC ATAA^ 

AATAAAGAATACGATTTGGACTTCTATTCAACAGGCAAATACTGTAACGAACAACATACGCACGGCCAAGATTCCAACAATTGGTATTACAA 

FISYAKPEDKLSVY D 1 A C C M R A G S K V V N H N V 

■ Replicase 1a> _ v 




TTAA TCAAAGAGTCAATACCTATTGTTTGGGGTGTCAAGGACTTTAA ^ 
AATTAGTTTCTCAGTTATGGATAACAAACCCCACAGTTCCTGAAATTATGAGAAAGAGTTCTTCCATTCTTCATGGAACAATTTTGTTGATT 

, .KES1PIVWGVKDFNTLSQEG K KYLVKTTK 
*- Replicase 1a— ~~~ ' 



AGCAA AGGGTTTGACTTTTTTATTAACTTTTAATGATAACCAAGCAATTACACAAGTTCCTGCTACTAGTATAGTTGCAAAACAGGGTGCTG ^ 
TCGTTTCCCAAACTGAAAAAATAATTGAAAATTACTATTGGTTCGTTAATGTGTTCAAGGACGATGATCATATCAACGTTTTGTCCCACGAC 

AKGLTF LLTFNDNOAITQV PATS 1VAKQGA 
A K b L ' — Replicase 1a- 



GTTTTAA ACGTACTTATAATTTTCTGTGGTATGTATGTTTATTTGTTGTTGCATTGTTTATTGGTGTCTCATTTATTGATTATACAACCACT ? 
CAAAATTTGCATGAATATTAAAAGACACCATACATACAAATAAACAACAACGTAACAAATAACCACAGAGTAAATAACTAATATGTTGGTGA 

G F K R T Y N F L W Y V C L F V V,,A L F I G V S F 1 0 Y T T T 



TTT 



G TAACTAGCTTTCATGGTTATGATTTTAAGTACATTGAGAATGGTCAGTTGAAGGTGTTTGAAGCACCTTTACACTGTGTTCGTAATGT ? 
CATTGATCGAAAGTACCAATACTAAAATTCATGTAACTCTTACCAGTCAACTTCCACAAACTTCGTGGAAATGTGACACAAGCATTACAAAA 

VTSFHGYDFKY IE N G 0 L KVFEAPLHCVRNVF 
v ' —Replicase 1a ■ 

TGATAAT TTTAATCAATGGCATGAGGCTAAGTTTGGTGTTGTTACTACTAATAGTGATAAATGTCCTATAGTTGTTGGTGTTTCAGAGCGTA 
ACTATTAAAATTAGTTACCGTACTCCGATTCAAACCACAACAATGATGATTATCACTATTTACAGGATATCAACAACCACAAAGTCTCGCAT 

D N F N 0 W H E A K F Q V V T T 



TTAAT GTTGTTCCTGGTGTTCCAACAAATGTATATTTGGTAGGAAAGACTCTTGTTTTTACATTACAGGCTGCTTTTGGAAACACAGGTGTT 
AATTACAACAAGGACCACAAGGTTGTTTACATATAAACCATCCTTTCTGAGAACAAAAATGTAATGTCCGACGAAAACCTTTGTGTCCACAA 

INVVPGVPTNVYLVGKTLVF T LQAAFGNTGV 
' — — Replicase 1a — 

TGTTATG ACTTTGATGGTGTTACCACTAGTGATAAGTGTATTTTTAATTCTGCTTGTACTAGGTTGGAAGGTTTGGGTGGTGACAATGTTTA 
ACAATACTGAAACTACCACAATGGTGATCACTATTCACATAAAAATTAAGACGAACATGATCCAACCTTCCAAACCCACCACTGTTACAAAT 

CYD FDGVTTSOKC I FNSACT RLEGLGGDNVY 
— Replicase 1a ■ — 



TTGTTACAA CACTGATCTTATTGAAGGTTCTAAACCTTATAGTATTTTACAGCCCAATGCTTATTATAAGTATGATGTTAAAAATTATGTAC 
AACAATGTTGTGACTAGAATAACTTCCAAGATTT6GAATATCATAAAATGTCGGGTTACGAATAATATTCATACTACAATTTTTAATACATG 



qyNTDL I EGSKPYS I LQP NAYYKYOVKNYV 
. Replicase 1a — 



RTTTTCCAGAAATTTTAGCTAGAGGTTTTGGCTTACGTACTATTAGAACTTTGG CTACACGTTATTGTAGAGTTGGTGAATGCCGTGAC 




TCA 



CAAAAGGTCTTTAAAATCGATCTCCAAAACCGAATGCATGATAATC 

R F P E I L A R G F G L R T t R T LATRYCRVG-ECRDS 
. — Replicase 1a — 



CATAAAGGTGTTTGTTTTGGTTTTGATAAATGGTATGTTAATGATGGACGTGTTGATGACGGTTACATTTGTGGTGATGGTCTTATAGACC 

GTATTTCCACAAACAAAACCAAAACTATTTACCATACAATTACTACCTGCACAACTACTGCCAATGTAAACACCACTACCAGAATATCTGG 

H K G VCFGFDKWYVNDGRVDDGY I CGDGL I D 
■ Replicase 1a- — __ 



TCTTGTTAATGTACTCTCAATCTTTAGTTCATCTTTTAGCGTTGTGGCTATGTCTGGACATATGTTGTTTAATTTTCTTTTTGCAGCATTT 

AGAACAATTACATGAGAGTTAGAAATCAAGTAGAAAATCGCAACACCGATACAGACCTGTATACAACAAATTAAAAGAAAAACGTCGTAAA 

L V NVLSIFSSSFSVVAMSGHMLFNFLFAAF 

— Replicase 1a - — — — 



TTACATTTTTGTGCTTTTTAGTTACTAAATTTAAACGTGTTTTTGGTGATCTTTCTTATGGTGTTTTTACTGTTGTTTGTGCAACTTTGAT 

AATGTAAAAACACGAAAAATCAATGATTTAAATTTGCACAAAAACCACTAGAAAGAATACCACAAAAATGACAACAAACACGTTGAAACTA, 

j T F LCFLVTKFKRVFGDLSYGVFTVVCATL I 
— — Replicase 1a 

AATAACATTTCTTATGTTGTTACTCAAAATTTATTTTTTATGTTGCTTTATGCTATTTTGTATTTTGTTTTTACTAGGACAGTGCGTTATGi 

TTATTGTAAAGAATACAACAATGAGTTTTAAATAAAAAATACAACGAAATACGATAAAACATAAAACAAAAATGATCCTGTCACGCAATAC( 

N N ISYVVTQNLFFMLLYA I LYFVFTRTVRY 
Replicase 1 a ■ 



TTGGATTTGGCATATTGCATACATTGTTGCATACTTCTTGTTAATACCATGGTGGCTTCTCACATGGTTTAGTTTTGCTGCATTTTTAGAGt 

AACCTAAACCGTATAACGTATGTAACAACGTATGAAGAACAATTATGGTACCACCGAAGAGTGTACCAAATCAAAACGACGTAAAAATCTCC 

WIWH IAY I VAYFLL 1 PWWLLTWFSFAAFLE 
• — Replicase 1 a : . 



TTTTACCTAATGTTTTTAAGTTAAAAATCTCTACTCAATTGTTTGAAGGTGATAAGTTTATAGGTACTTTTGAGAGTGCTGCTGCAGGTAC/ 
AAAATGGATTACAAAAATTCAATTTTTAGAGATGAGTTAACAAACTTCCACTATTCAAATATCCATGAAAACTCTCACGACGACGTCCATG1 

LLPNVFKLK I STQLFEGDKF I GTFESAAAGT 
— — : Repiicase 1a — . — 

TTTGTTCTTGACATGCGTTCTTATGAAAGGCTGATAAATACTATTTCACCTGAGAAACTTAAGAATTATGCTGCAAGTTATAATAAATATAA

AAACAAGAACTGTACGCAAGAATACTTTCCGACTATTTATGATAAAGTGGACTCTTTGAATTCTTAATACGACGTTCAATATTATTTATATT 

F V L D M R S Y ERLINTISPEKLKNYAASYNKY* 
Replicase 1a — 




TATAATATCACCATCACGATCACTCCGACTAATAGCAACACGAACAATACGAGTAAATCGGTTCCGATACAATCTAATGCGTTTTCTAGTAT 

Y Y s 6 s A SEADYRCACYAHLAKAMLDYAKDH 
— — -Replicase 1a ■ — 



ATGACATGTTATATTCTCCACCTACCATTAGCTACAATTCCACCTTACAATCTGGTCTTAAGAAGATGGCACAACCATCTGGTTGTGTTGAG 

TACTGTACAATATAAGAGGTGGATGGTAATCGATGTTAAGGTGGAATGTTAGACCAGAATTCTTCTACCGTGTTGGTAGACCAACACAACTC 



N D H L Y S P PT I SYNSTL OSGLKKMAGPSGCVE 
— Replicase 1a ■ — — 






AGATGTGTGGTTCGCGTCTGTTATGGTAGTACTGT6CTTAATGGAGTTTGGTTAGGTGACACTGTTACTTGTCCTAGACATGTCATAGCACC 


TCTACACACCAAGCGCAGACAATACCATCATGACACGAATTACCTCAAACCAATCCACTGTGACAATGAACAGGATCTGTACAGTATCGTGG 



RCVVRVCYGSTVLNGVWLGDTVTCPRHV I AP 
_ Replicase 1a 

ATCAACCACTGTTCTTATTGATTATGATCATGCATATAGTACTATGCGTTTGCATAATTTTTCAGTGTCTCATAATGGTGTCTTCTTGGGAG 


TAGTTGGTGACAAGAATAACTAATACTAGTACGTATATCATGATACGCAAACGTATTAAAAAGTCACAGAGTATTACCACAGAAGAACCCTC 

STTVL I DYDHAYSTMRLHNFSVSHNGVF L G 

Replicase 1a 



TTGTTGGTGTTACAATGCATGGTTCTGTGTTGCGTATTAAGGTTTCACAATCTAATGTACATACACCT AAACATGTTTTTAAAACGTTGAAA 


AACAACCACAATGTTACGTACCAAGACACAACGCATAATTCCAAAGTGTTAGATTACATGTATGTGGATTTGTACAAAAATTTTGCAACTTT 

VVGVTMHGSVLR I KVSQSNVHTPK H V F K T L K 
. — Replicase 1 a — — 

rr.TRGTGCTTCTTTTAATATTTTAGCATGTTATGAAGGTATTGCATCTGGTGTTTTTGGTGTTAATTTACGTACAA ACTTTACTATTAAAGG 


GGACCACGAAGAAAATTATAAAATCGTACAATACTTCCATAACGTAGACCACAAAAACCACAATTAAATGCATGTTTGAAATGATAATTTCC 

PGASFN I LACYEG IASGVFGV NLRTNFT IKG 
— Replicase 1a— 



TTCTTTTATAAATGGAGCTTGTGGTTCTCCTGGTTATAATGTTAGAAATGATGGTACTGTTGAGTTTTGTTAT TTACACCAAATTGAGTTAG 


AAGAAAATATTTACCTCGAACACCAAGAGGACCAATATTACAATCTTTACTACCATGACAACTCAAAACAATAAATGTGGTTTAACTCAATC 

SF j NGACGSPGY NVRNDGTV.EFC Y L H Q I EL 
- — Replicase 1 a " 



GTAGTGG TGCTCATGTTGGTTCTGATTTTACTGGTAGTGTTTATGGTAATTTTGATGACCAACCTAGTTTGCAAGTTGAGAGTGCCAAC _ 
CATCACCACGAGTACAACCAAGACTAAAATGACCATCACAAATACCATTAAAACTACTGGTTGGATCAAACGTTCAACTCTCACGGTTGGAA 

GSGAHVGSOFTGSVYGNFDDQPSL Q V E S A N L 
__ ■— Replicase 1 a — — ■ 



ATGCTATCAGATAATGTTGTTGCCTTTTTGTATGCTGCTTTGTTGAATGGTTGTAGGTGGTGGTTGCGTTCAACTAGAGTTAATGTTGATGG 


TACGATAGTCTATTACAACAACGGAAAAACATACGACGAAACAACTTACCAACATCCACCACCAACGCAAGTTGATCTCAATTACAACTACC 



MLSDNVVAFLYAALLNGCRWWLR STRVNVDG 
Replicase 1a ~ ~ 



TTTTAAT6AATGGGCTATGGCTAATGGTTATACAATTGTTTCTAGTGTTGAGTGCTATTCTATTTTGGCAGCAAAAACTGGTGTTAGTGTTG 


AAAATTACTTACCCGATACCGATTACCAATATGTTAACAAAGATCACAACTCACGATAAGATAAAACCGTCGTTTTTGACCACAATCACAAC 

FNEWAMANGYT I VSSVECYS 1 LAAKTGVSV 
— Replicase 1 a ■ — — 



AACAATTGTTAGCTTCCATTCAACATCTTCATGAAGGTTTTGGTGGTAAAAACATACTTGGTTATTCTAGTTTATGTGATGAGTTCACACTA 


TTGTTAACAATCGAAGGTAAGTTGTAGAAGTACTTCCAAAACCACCATTTTTGTATGAACCAATAAGATCAAATACACTACTCAAGTGTGAT 

EQLLASIQHLHEGFGGKNILGYSSLCDEFTL



GCTGAAGTT GTGAAGCAGATGTATGGTGTTAACTTGCAAAGTGGTAAGQTTATTTTTGGTTTAAAAACAATGTTTTTATTTAGCGTTTTrT 


CGACTTCAACACTTCGTCTACATACCACAATTGAACGTTTCACCATTCCAATAAAAACCAAATTTTTGTTACAAAAATAAATCGCAAAAGA, 

AEVVKQMYGVNLQSGKV I FGLK TMFLFSVF 
— " ■ Replicas© 1a — 



CACAATGTTTT6GGCAGAACTCTTTATTTATACAAACACTATATGGATAAACCCT6TTATACTTACACCTATATTTT6TTTACTTTT6TTT" 

GTGTTACAAAACCCGTCTTGAGAAATAAATATGTTTGTGATATACCTATTTGGGACAATATGAATGTGGATATAAAACAAATGAAAACAAA^ 

TMFWAELF I YTNT I W I NPV I LTP I FCLLLF 
RepHcase 1a — ■ ■ 



TGTCATTAGTTTTAACTATGTTTCTTAAACATAAGTTTTTGTTTTTGCAAGTATTTTTATTACCTACTGTTATTGCAACTGCTTTATATAA* 

ACAGTAATCAAAATTGATACAAAGAATTTGTATTCAAAAACAAAAACGTTCATAAAAATAATGGATGACAATAACGTTGACGAAATATATT/ 

I S L VLTMFLKHKFLFLQVFLLPTVIATALYN 
— — RepHcase 1 a 

TGTGTTTTGGATTATTACATAGTAAAATTTTTGGCTGACCATTTTAACTATAATGTTTCAGTATTACAAATGGATGTTCAGGGTTTAGTTA/ 

ACACAAAACCTAATAATGTATCATTTTAAAAACCGACTGGTAAAATTGATATTACAAAGTCATAATGTTTACCTACAAGTCCCAAATCAAT1 

CVLDYY I VKFLADHFNYNVSVLQMDVQGLVI 
• RepHcase 1a ■ 



TGTTTTGGTCTGTTTATTTGTTGTATTTTTACACACATGGCGTTTTTCTAAAGAACGTTTCACACATTGGTTTACATATGTGTGTTCTCTT/s 

ACAAAACCAGACAAATAAACAACATAAAAATGTGTGTACCGCAAAAAGATTTCTTGCAAAGTGTGTAACCAAATGTATACACACAAGAGAAT 

V I VCLFVVFLHTWRFSKERF. TH WFTYV CSL 
— RepHcase 1a- — . 



TAGCAGTTGCTTACACTTATTTTTATAGTGGTGACTTTTTGAGTTTGCTTGTTATGTTTTTATGTGCTATATCTAGTGATTGGTACATTGGT 

ATCGTCAACGAATGTGAATAAAAATATCACCACTGAAAAACTCAAACGAACAATACAAAAATACACGATATAGATCACTAACCATGTAACCA 

I AVAY TYF Y S GO FLSL-LVMF LC A ISSOWY I G 
— RepHcase 1a — „ 



GCCATTGTTTTTAGGTTGTCACGTTTGATTATATTTTTTTCACCTGAAAGTGTATTTAGTGTTTTTGGTGATGTGAAACTCACTTTAGTTGT

CGGTAACAAAAATCCAACAGTGCAAACTAATATAAAAAAAGTGGACTTTCACATAAATCACAAAAACCACTACACTTTGAGTGAAATCAACA 

A I V F RLSRLI IFFSPESVFSVFGDVKLTLVV 
— — — RepHcase 1a — 




AATAAATTAAACACCAATAAATCAAACATGAATAACCCCGTAAAACATAACCAAGTTATCCAAAAAATTTACATGATACCCACAAATACTAA 

Y I I C GYLVCTYWG I LYWFN R F F K C T M G-V Y D 
-RepHcase 1a . 



TTAAGGTGAGTGCTGCTGAATTTAAATACATGGTTGCTAATGGACTTCATGCACCATATGGACCTTTTGATGCACTTTGGTTATCATTCAAA 

AATTCCACTCACGACGACTTAAATTTATGTACCAACGATTACCTGAAGTACGTGGTATACCTGGAAAACTACGTGAAACCAATAGTAAGTTT 

F K V S A A EFKYMVANGLHAPYGPFDALWLSFK 
Replicase 1a



TTACTTGGTATTGGTGGTGACCGTTGTATAAAAATTTCAACTGTCCAATCCAAACTGACTGATTTGA AGTGTACTAATGTTGTGTTATTGGG 


AATGAACCATAACCACCACTGGCAACATATTTTTAAAGTTGACAGGTTAGGTTTGACTGACTAAACTTCACATGATTACAACACAATAACCC 

L L G I G G D R C I K I S T V Q S KLTDLKCTNVVLLG
— Replicase 1a- ~~ 



TTGTTTGTCTAG TATGAACATTGCAGCTAATTCTAGTGAATGGGCTTATTG 

AACAAACAGATCATACTTGTAACGTCGATTAAGATCACTTACCCGAATAACACAACTAAATGTGTTATTCTAATTAGAAACACTACTGGGTC 

CLSSMN I AANSSEWAY C VDL HNK I NL C D DP 
-Replicase 1a — 



AAAAAGCTCAAGGTATGTTGTTAGCACTCCTTGCGTTCTTTCTAAGTAAACATA GTGATTTTGGTCTTGATGGCCTTATTGATTCTTATTTT 


TTTTTCGAGTTCCATACAACAATCGTGAGGAACGCAAGAAAGATTCATTTGTATCACTAAAACCAGAACTACCGGAATAACTAAGAATAAAA



E K A Q G M L L A L L A F F L S K H S D F G L D G L I D S Y F



GATAA TAGTAGCACCCTGCAGAGTGTTGCTTCATCATTTGTTAGTATGCCATCATATATTGCTTATGAAAATGCTAGACAAGCTTATGAGGA 
CTATTATCATCGTGGGACGTCTCACAACGAAGTAGTAAACAATCATACGGTAGTATATAACGAATACTTTTACGATCTGTTCGAATACTCCT 

DNSSTLQSVASSFV S MP S Y I A Y E N A R Q A Y E D
Replicase 1a



TGCTATTGCTAAT GGATCTTCTTCTCAACTTATTAAACAATTGAAGCGTGCCATGAATATCGCAAAGTCTGAATTTGATCATGAGATATCTG 
ACGATAACGATTACCTAGAAGAAGAGTTGAATAATTTGTTAACTTCGCACGGTACTTATAGCGTTTCAGACTTAAACTAGTACTCTATAGAC

AIANGSSSQLI K Q L K R A M N I A K S E F D H E I S
Replicase 1a — 



TTCAGAAGAAA ATTAATAGAATGGCTGAACAAGCTGCTACTCAGATGTATAAAGAAGCACGCTCTGTTAATAGAAAATCTAAAGTTATTAGT 
AAGTCTTCTTTTAATTATCTTACCGACTTGTTCGACGATGAGTCTACATATTTCTTCGTGCGAGACAATTATCTTTTAGATTTCAATAATCA 

VQKK I NRMAEQAAT Q MY KEARSVNRKSKV I S 
Replicase 1a



GCTATG CACTCTTTACTTTTTGGAATGTTAAGACGTTTGGATATGTCTAGTGTTGAAACTGTTTTGAATTTAGCACGTGATGGTGTTGTGCC 
CGATACGTGAGAAATGAAAAACCTTACAATTCTGCAAACCTATACAGATCACAACTTTGACAAAACTTAAATCGTGCACTACCACAACACGG 

AMHS LLFGMLRRLDMSSVE TVLNLARDGVVP 
— Replicase 1a — — — 



ATTGTCAGTTATACCTGCAACTTCAGCTTCCAAACTAACTATTGTTAGTCCAGATCTTGAATCT TATTCTAAGATTGTTTGTGATGGTTCTG 




TAACAGTCAATATGGACGTTGAAGTCGAAGGTTTGATTGATAACAATCAGGTCTAGAACTTAGAATAAGATTCTAACAAACACTACCAAGAC 

LSVIPATSASKLT IVSPDL ESYSK I VCDGS
Replicase 1a



TTCATTATGCTGGAGTTGTTTGGACACTTAATGATGTTAAAGACAATGATGGTAGACCTGTTC ATGTTAAAGAGATTACAAGGGAGAATGTT


AAGTAATACGACCTCAACAAACCTGTGAATTACTACAATTTCTGTTACTACCATCTGGACAAGTACAATTTCTCTAATGTTCCCTCTTACAA 

VHYAGVVWTLNDVKDNDGRPV HVKE I TRENV
Replicase 1a



EMCR-CoV.MPD (1 > 27532) Site and Sequence



^TTTGACATGGCCTCTTATCCTTAATTGTGAACGTGTT^ 

CTTTGAAACTGTACCGGAGAATAGGAATTAACACTTGCACAACAATTTGAAGlTTTAlTACTTTAATlcGGACCATTlGAATicGTTiTT. 
ETLTWPLILNCER V V K L QNNE I MPGKLKQK
Replicase 1a



TATGAAAGCTGAGGGTGATGGTGGTGTTTTAGGTGATGGTAATGCTTTGTATAATACTGAGGGTGGTAAAACTTTTATGT ATGCTTATAT

ATACTTTCGACTCCCACTACCACCACAAAATCCACTACCATTACGAAACATATTATGACTCCCACCATTTTGAAAATACATACGAATATA

MKAEGDGGVLG DGNALYNTEGGKTFMYAY I
Replicase 1a

CTAATAAAGCTGACCTTAAATTTGTTAAGTGGGAGTATGAGGGTGGTTGCAACACAATCGAGTTAGACTCTCCTTGTCG ATTTATGGTCG/ 
GATTATTTCGACTGGAATTTAAACAATTCACCCTCATACTCCCACCAACGTTGTGTTAGCTCAATCTGAGAGGAACAGCTAAATACCAG^ 
SNKADLKFVK WEYE J^. N TIELDSPCRFMVt 

ACACCTAATGGTCCTCAAGTGAAGTATTTGTATTTTGTTAAAAATTTAAATACCTTACGTAGAGGTGCCGTTCTTGGTT TTATAGGTGCCA 
TGTGGATTACCAGGAGTTCACTTCATAAACATAAAACAATTTTTAAATTTATGGAATGCATCTCCACGGCAAGAACCAAAATATCCACGGT 

T P N G P Q V K Y L Y F V K N L N T L RRGAVLGF I GA
Replicase 1a

AATTCGTCTACAAGCTGGTAAACAAACTGAATTGGCTGTTAATTCTGGACTTTTAACTGCTTGTGCTTTTTCTGTTGAT CCAGCAACC
TTAAGCAGATGTTCGACCATTTGTTTGACTTAACCGACAATTAAGACCTGAAAATTGACGAACACGAAAAAGACAACTAGGTCGTTGGTGA

I RLQAGKQTELAV N SGLLTA CAFSVDPATT
Replicase 1a

ACTTGGAAGCTGTTAAACATGGTGCAAAACCTGTAAGTAA TTGTATTAAGATGTTATCTAATGGTGCTGGTAATGGTCAAGCTATAACAAC



TGAACCTTCGACAATTTGTACCACGTTTTGGACATTCATTAACATAATTCTACAATAGATTACCACGACCATTACCAGTTCGATATTGTTG/ 

Replicase 1a



Y L E A V KHGAKPVSN C I KMLSNGAGNGQA I T 



AGTGTAGATGCTAACACCAATCAAGATTCTTATGGTGGAGCGTCTATTTGTTTGTATTGTCGGGCCCACGTTCCTCACCCTAGTAT GGATGG
TCACATCTACGATTGTGGTTAGTTCTAAGAATACCACCTCGCAGATAAACAAACATAACAGCCCGGGTGCAAGGAGTGGGATCATACCTACC
SVDANTNQDSYGG A S I C LYCRAHVPHPSMDG

:I TA P G .T AA . G U.TAAGG_GTAAATGTGTTCAG^ 

AATGACATTCAAATTCCCATTTACACAAGTCCAAGGATAACCAACAAACCTAGGATAATCCAAAACAAATCTTTTATTACACACATTACAAA 



Y C K F K G K C V ° V P ' RepScaie 1a—- • rcFCLE NNVCNV 



01 December 2003 11:52

GTGGTT GTTGGTTGGGACACGGGTGTGCTTGTGATCGTACAACCATTCAAAGTGTTGACATTTCTTATTTAAACGAGCAAGGGGTTCTAGTG

CACCAACAACCAACCCTGTGCCCACACGAACACTAGCATGTTGGTAAGTTTCACAACTGTAAAGAATAAATTTGCTCGTTCCCCAAGATCAC 

CGCWLGHGCACDRT TIQS VD ISYLNEQGVLV

Replicase 1a

R A R G S S 
' Replicase 1b 



CAGCTCG ACTAGAACCCTGTAATGGCACGGACATCGATAAGTGTGTTCGTGCTTTTGACATTTATAATAAAAATGTTTCATTCTTGGGTAAG ^ 
GTCGAGCTGATCTTGGGACATTACCGTGCCTGTAGCTATTCACACAAGCACGAAAACTGTAAATATTATTTTTACAAAGTAAGAACCCATTC 

0. L D . j 

-Replicasela n q y Q , Q R c y R AFP 1YNKNVSFLGK 
— Replicase 1 b 



TGTTT GAAGATGAACTGTGTTCGTTTTAAAAATGCTGATCTTAAGGATGGTTATTTTGTTATAAAGAGGTGTACTAAGTCGGTTATGGAACA 
ACAAACTTCTACTTGACACAAGCAAAATTTTTACGACTAGAATTCCTACCAATAAAACAATATTTCTCCACATGATTCAGCCAATACCTTGT 

CLKMNCVRFKNADL K D G YFVIKRCTKSVMEH
Replicase 1b

CGAGCAAT CCATGTATAACCTACTTAACTTTTCTGGTGCTTTGGCTGAGCATGATTTCTTTACTTGGAAAGATGGCAGAGTCATTTATGGTA 
GCTCGTTAGGTACATATTGGATGAATTGAAAAGACCACGAAACCGACTCGTACTAAAGAAATGAACCTTTCTACCGTCTCAGTAAATACCAT 

EQSMYN LLNFSGALAE H DFFTWKDGRV I YG 
— Replicase 1b _ ; 



ATGTTAGTAGACATAATCTTACTAAATATACTATGATGGACTTGGTTTATGCTATGCGTAACTTTGATGAACAAAATTGTGATGTTCTAAAA 
TACAATCATCTGTATTAGAATGATTTATATGATACTACCTGAACCAAATACGATACGCATTGAAACTACTTGTTTTAACACTACAAGATTTT 

NVSRHNLT KYTMMDLV Y AMRNFDEQNCDVLK

Replicase 1b

GAAGTATTAGTTTTAACTGGTTGTTGTGACAATTCTTATTTTGATAGTAAGGGTTGGTATGA CCCAGTTGAAAATGAAGATATACATAGAGT

CTTCATAATCAAAATTGACCAACAACACTGTTAAGAATAAAACTATCATTCCCAACCATACTGGGTCAACTTTTACTTCTATATGTATCTCA

EVLVLTGCCDNSYFDSKGWYDP VENEDIHRV
Replicase 1b



TTATGCATCTCTTGGCAAAATTGTAGCTAGAGCTATGCTTAAATGCGTTGCTCTATGTGA TGCGATGGTTGCTAAAGGTGTTGTTGGTGTTT


AATACGTAGAGAACCGTTTTAACATCGATCTCGATACGAATTTACGCAACGAGATACACTACGCTACCAACGATTTCCACAACAACCACAAA 

YASLGK I VARAMLKCVALC D AMVAKGVVGV
Replicase 1b — — 



TAACATTAGATAACCAAGATCTTAATGGTAACTTTTATGATTTTGGTGATTTTG TTGTTAGCTTACCTAATATGGGTGTTCCCTGTTGTACA 




ATTGTAATCTATTGGTTCTAGAATTACCATTGAAAATACTAAAACCACTAAAACAACAATCGAATGGATTATACCCACAAGGGACAACATGT 

LTLDNQDLNGNFYDFGD F VVSLPNMGVPCCT
Replicase 1b






TCATATTATTCTTATATGATGCCTATTATGGGTTTAACTAATTGTTTAGCTAGTGAGTGTTTTGTCAAGAGTGATATTTTTGGTAGTGAT1 

AGTATAATAAGAATATACTACGGATAATACCCAAATTGATTAACAAATCGATCACTCACAAAACAGTTCTCACTATAAAAACCATCACTA/ 

SY Y SYMMP IMGLTNCLASECFVKSD I FGSD 
— Replicase 1b — . 



TAAAACTTTTGATTTGCTTAAGTATGATTTCACTGAACATAAAGAAAATTTATTCAATAAGTACTTTAAGCATTGGAGTTTTGATTATCAT

ATTTTGAAAACTAAACGAATTCATACTAAAGTGACTTGTATTTCTTTTAAATAAGTTATTCATGAAATTCGTAACCTCAAAACTAATAGTA 

K T FDLLKYDFTEHKENLFNKYFKHWSFDYH
Replicase 1b

CTAATTGTAGTGACTGTTATGATGATATGTGTGTTATACATTGTGCTAATTTTAATACACTATTTGCCACAACTATACCAGGTACTGCTTT 


GATTAACATCACTGACAATACTACTATACACACAATATGTAACACGATTAAAATTATGTGATAAACGGTGTTGATATGGTCCATGACGAAA 

PNCSDCYDDMCVIHCANFNTLFATTIPGTAF 
— Replicase 1b— , 

GGTCCACTATGTCGTAAAGTTTTTATAGATGGTGTTCCACTTGTTACAACTGCTGGTTATCATTTTAAGCAATTAGGTTTGGTTTGGAATA 


CCAGGTGATACAGCATTTCAAAAATATCTACCACAAGGTGAACAATGTTGACGACCAATAGTAAAATTCGTTAATCCAAACCAAACCTTAT 

GPLCRKVF I DGVPLVTTAGYHFKQLGLVWN
— ■ — Replicase 1b — — - , 



AGATGTTAACACACACTCAGTTAGGTTGACAATCACTGAACTTTTGCAATTTGTTACTGACCCTTCCTTGATAATAGCTTCTTCTCCAGCA 


TCTACAATTGTGTGTGAGTCAATCCAACTGTTAGTGACTTGAAAACGTTAAACAATGACTGGGAAGGAACTATTATCGAAGAAGAGGTCGT 

D VNTH SVRLT I T ELLQFVTDPSL I IASSPA
Replicase 1b ■ : — 



TCGTTGATCAACGCACTATTTGTTTTTCTGTTGCAGCATTGAGTACTGGTTTGACAAATCAAGTTGTTAAGCCAGGTCATTTTAATGAAGAI 

AGCAACTAGTTGCGTGATAAACAAAAAGACAACGTCGTAACTCATGACCAAACTGTTTAGTTCAACAATTCGGTCCAGTAAAATTACTTCTI 

L V D Q R T I CFSV AALSTGLTNQVVKPGHFNEE
Replicase 1b—; — — 

TTTTATAACTTTCTTCGTTTAAGAGGTTTCTTTGATGAAGGTTCTGAACTTACATTAAAACATTTCTTCTTCGCACAGAATGGTGATGCTGt 


AAAATATTGAAAGAAGCAAATTCTCCAAAGAAACTACTTCCAAGACTTGAATGTAATTTTGTAAAGAAGAAGCGTGTCTTACCACTACGACC 

FYNFLRLR G FFDEGSELTLKHFFFAQNGDA 
Replicase 1b — 




ACAATTTCTAAAACTGAAAATGGCAATATTATTCGGATGGTAAAATCTATAAACAGTTCGATCTCAATGTATATTCTATCAGAGAGCAATA/ 

VKDFDFYR YNKPT ILD ICQARVTYK IVSRY
— — Replicase 1b . 



TTGACATTTATGAAGGTGGCTGTATTAAGGCATGTGAAGTTGTTGTAACAAATCTTAATAAGAGTGCTGGTTGGCCATTAAATAAGTTTGGT 

AACTGTAAATACTTCCACCGACATAATTCCGTACACTTCAACAACATTGTTTAGAATTATTCTCACGACCAACCGGTAATTTATTCAAACC^ 

F D I Y E G G C I KACEVVVTNLNKSAGWPLNKFG
— Replicase 1b — — _ 




AAAGCTAGTT TGTATTACGAATCTATATCTTATGAAGAACAGGATGCTTT^ ^ 

TTTCGATCAAACATAATGCTTAGATATAGAATACTTCTTGTCCTACGAAACAAACGAAACTGTTTCGCATTACAGGAGGGATGATACTGTGT

KASLYYES I SYEEQDA L FALTKRNVLPTMTQ
Replicase 1b

G CTGAATCTTAAGTATGCTATTAGTGGTAAAGAACGTG^ 1 
CGACTlAGAAiTCATACGATAATCAicATTTCTTGiACGAicTTGACAACCACCACAAAGAGACAACAGGTGTTACTGGTGTTCTGTTATGG 

LNLKYAISGKERARTV G GVSLLSTM T T R QY
Replicase 1b

AT CAAAAACATCTTAAATCCATTGTTAATACACGCAATG^ . 
TAGTTTTTGTAGAATTTAGGTAACAATTATGTGCGTTACGGTGACAACAATAACCATGATGGTTTAAAATACCACCAACCTTATTATACAAC 

HQKHLKS I VNTRN A M^V J STTKFYGGWNNML 

CGTAC TTTAATTGATGGTGTTGAAAACCCTATGCTCATGGGTTGGGATTATCCCAAATGTGATAGAGCTTTGCCTAACATGATACGTATGAT 
GCATGAAATTAACTACCACAACTTTTGGGATACGAGTACCCAACCCTAATAGGGTTTACACTATCTCGAAACGGATTGTACTATGCATACTA

R T L I D G V E N P M L M G W D Y PKCDRALPNM I R M



TTCAGCCATGGTGTTGGGTTCTAAGCATGTTAAT TGTTGTACTGTAACAGATAGGTTTTATAGGCTTGGTAACGAGTTGGCACAAGTTTTAA 

AAGTCGGTACCACAACCCAAGATTCGTACAATTAACAACATGACATTGTCTATCCAAAATATCCGAACCATTGCTCAACCGTGTTCAAAATT



SAMVLGS KHVNCC TVT

CAGA AGTTGTTTATTCTAATGGTGGTTTTTATTTTAAGCCAGGTGGTACGACTTCTGGTGACGCTAGTACAGCTTATGCTAATTCTATTTTT 
GTCTTCAACAAATAAGATTACCACCAAAAATAAAATTCGGTCCACCATGCTGAAGACCACTGCGATCATGTCGAATACGATTAAGATAAAAA

TEVVYSNGGFYFKP GGTTSGDASTAYAN S I F



A ACATTTTTCAAGCCGTGAGTTCTAACATTAACAGGTTGCTTAGTGTCCCATCAGATTCATGTAATAATGTTAATGTTAGGGATCTACAACG 
TTGTAAAAAGTTCGGCACTCAAGATTGTAATTGTCCAACGAATCACAGGGTAGTCTAAGTACATTATTACAATTACAATCCCTAGATGTTGC

NIFQAVSSNINRL L S V P S D S C N N V N V R D L Q R



AC GTCTGTATGATAATTGCTATAGGTTAACTAGTGTTGAAGAGTCATTCATTGATGATTATTATGGTTATCTTAGGAAACATTTTTCAATGA 
TGCAGACATACTATTAACGATATCCAATTGATCACAACTTCTCAGTAAGTAACTACTAATAATACCAATAGAATCCTTTGTAAAAAGTTACT

R L Y DNCYRLTS V E E S F I DDYY GYLRKH FSM
— — Replicase 1b — 

TGATTCTCTCTGATGACGGTGTTGTCTGTTATAACAAGGATTATGCTGAGTTAGGTTATATAGCAGACATTAGTGCTTTTAAAGCCACTTTG
ACTAAGAGAGACTACTGCCACAACAGACAATATTGTTCCTAATACGACTCAATCCAATATATCGTCTGTAATCACGAAAATTTCGGTGAAAC

M I L S D D G V V C Y N K D Y A E L G Y I A D I S A F K A T L






TATTACCAGAATAATGTCTTTATGAGTACTTCTAAATGTT^ 

ATAATGGTCTTATTACAGAAATACTCATGAAGATTTACAACCCAACTTCTTCTAAATTGATTCCCTGGTGTACTCAAAACAAGGGTCGTA1 

YYQ NNVFMS T SKCWVEEDLTKGPHEFCSQH 
~~~ — — — Replicase 1b- __ 



TATGCAAATAGTTGATAAAGATGGTAC CTATTATTTGCCTTACCCAGATCCTAGTAGGATCTTGTCAGCTGGTGTTTTTGTTGATGATGTT

ATACGTTTATCAACTATTTCTACCATGGATAATAAACGGAATGGGTCTAGGATCATCCTAGAACAGTCGACCACAAAAACAACTACTACAA

MQIVDKDGTYYLP Y PD PSRILSAGVFVDDV
Replicase 1b

TTAAGACAGATGCTGTTGTT TTGTTAKAACGTTATGTGTCTTTAGCTATTGATGCATACCCTCTTTCAAAACACCCTAATTCTGAATATCG 


AATTCTGTCTACGACAACAAAACAATMTTGCAATACACAGAAATCGATAACTACGTATGGGAGAAAGTTTTGTGGGATTAAGACTTATAGC 

V K T D A V V LL7RYVSLAI DAYPLSKHPNSEYF
Replicase 1b — __ 

AAGGTTTTTTACGTATTACTTGATTGGGTTAAGCATCTTAACAAAAATTTGAATGAGGGTGTTCTTGAATCTTTTTCTGTTACACTTCTTG 

TTCCAAAAAATGCATAATGAACTAACCCAATTCGTAGAATTGTTTTTAAACTTACTCCCACAAGAACTTAGAAAAAGACAATGTGAAGAAC 
KVFYVLLDWVKHLNKNLNEGVLESFSVTLI 

Replicase 1b



TAATCAAGAAGATAAGTTTTGGTGTGAAGATTTTTATGCTAGTATGTATGAAAATTCTACAATATTGCAAGCTGCTGGCTTATGTGTTGTT' 

ATTAGTTCTTCTATTCAAAACCACACTTCTAAAAATACGATCATACATACTTTTAAGATGTTATAACGTTCGACGACCGAATACACAACAA/ 

N Q E D K F WCEDFY ASMYEN ST I LQAAG LC VV
Replicase 1b



GTGGTTCACAAACTGTTCTTCGTTGTGGTGATTGTCTGCGTAAGCCTATGTTGTGCACTAAATGTGCATATGATCATGTATTT GGTACCGAC 
CACCAAGTGTTTGACAAGAAGCAACACCACTAACAGACGCATTCGGATACAACACGTGATTTACACGTATACTAGTACATAAACCATGGCTC 

C G S Q TVLRCGDCLRKPMLCTKCAYDHVFGTD
■ — ■ Replicase 1b ■ _ 



CACAAGTTTATTTTG GCTATAACACCGTATGTATGTAATGCATCAGGTTGTGGTGTTAGTGATGTTAAAAAATTGTATCTTGGTGGTTTGAA

GTGTTCAAATAAAACCGATATTGTGGCATACATACATTACGTAGTCCAACACCACAATCACTACAATTTTTTAACATAGAACCACCAAACTT 



HKFIL A I TPYVCNASGCGVSDVKKLYLGGLN
— Replicase 1b-— _______ 




AATGATAACATGTTTAGTATTTGGTGTCAACAGAAAAGGTAATACAAGACGACCATTATATAAACCAAATATATTTTTAAGTCGTTGACCAA 

Y Y C T N HKPQLSFPLCSA GNIFGLYKNSATG
Replicase 1b — — 



CCTTAGATGTTGAAGTTTTTAATAGGCTTGCAACGTCTGATTGGACTGATGTTAGGGACTATAAACTTGCTAATGATGTTAAAGATACACTT 

GGAATCTACAACTTCAAAAATTATCCGAACGTTGCAGACTAACCTGACTACAATCCCTGATATTTGAACGATTACTACAATTTCTATGTGAA 

S L D V E V F NRLATSDWTDVRDYKLANDVKDTL 
~~ Replicase 1b — 




AGACT CTTTGCGGCTGAAACTATTAAAGCTAAAGAAGAGAGTGTTAAGTCTTCT^ ^ 

TCTGAGAAACGCCGACTTTGATAATTTCGATTTCTTCTCTCACAATTCAGAAGAATACGAAAACGTTGAGAATTTCTCCAACAACCTGGATT 

RLFAAET I KAKEESV K S SYAFATLKEVVGP K
— Replicase 1b 



AGAATT GCTTCTTAGTTGGGAAAGTGGTAAAGTTAAACCACCTTTGAATCGTAATTCTGTTTTCACCTGTTTTCAAATAAGTAAGGACTCAA ^ 
TCTTAACGAAGAATCAACCCTTTCACCATTTCAATTTGGTGGAAACTTAGCATTAAGACAAAAGTGGACAAAAGTTTATTCATTCCTGAGTT 

FLLLSWESGK VKPPLNRNSVFTCFQ I S K D S 

— Replicase 1b — ~~ 



AATTCCAAA TAGGTGAGTTCATCTTTGAAAAGGTTGAATATGGTTCTGATACTGTTACGTATAAGTCTACTGTAACCACTAAGTTAGTTCCT ^ 
TTAAGGTTTATCCACTCAAGTAGAAACTTTTCCAACTTATACCAAGACTATGACAATGCATATTCAGATGACATTGGTGATTCAATCAAGGA 

KFQIGEFIFE K V E Y G S D TVTYKSTVTTKLVP 
Replicase 1b

GGTAT GATTTTTGTCTTAACATCTCACAATGTTCAACCTTTACGTGCACCAACTATTGCAAACCAAGAGAAGTATTCTAGCATTTATAAATT ^ 
CCATACTAAAAACAGAATTGTAGAGTGTTACAAGTTGGAAATGCACGTGGTTGATAACGTTTGGTTCTCTTCATAAGATCGTAAATATTTAA 

GM I FVLTSHNVQPLRAPT I AN Q EKYSS I YKL
Replicase 1b



GCACCCT GCTTTTAATGTCAGTGATGCATATGCTAATTTGGTTCCATATTACCAACTTATTGGTAAACAAAAGATAACTACAATACAGGGTC 

CGTGGGACGAAAATTACAGTCACTACGTATACGATTAAACCAAGGTATAATGGTTGAATAACCATTTGTTTTCTATTGATGTTATGTCCCAG

HPAFNVSDAYANLVPY Y QL I G K Q K I T T I Q G
Replicase 1b 



CTCCTGGT AGTGGTAAGTCACATTGTTCCATTGGACTTGGATTGTACTATCCAGGTGCGCGTATTGTTTTTGTTGCTTGTGCCCATGCTGCT
GAGGACCATCACCATTCAGTGTAACAAGGTAACCTGAACCTAACATGATAGGTCCACGCGCATAACAAAAACAACGAACACGGGTACGACGA



PPGSG KSHCSI GLG L Y Y PGAR I VFVACAHAA
-Replicase 1b 



GTTGATT CCTTATGTGCAAAAGCTATGACTGTTTATAGCATTGATAAGTGTACTAGGATTATACCTGCAAGAGCTCGGGTTGAGTGTTATAG 
CAACTAAGGAATACACGTTTTCGATACTGACAAATATCGTAACTATTCACATGATCCTAATATGGACGTTCTCGAGCCCAACTCACAATATC 

VDSLCAKAMTVYS I DK C T R I IPARARVECYS
. — Replicase 1b — 



TGGCT TTAAACCAAATAACACTAGTGCACAATACATATTTAGCACTGTTAACGCATTACCTGAGTGTAATGCTGATATTGTTGTTGTAGATG 
ACCGAAATTTGGTTTATTGTGATCACGTGTTATGTATAAATCGTGACAATTGCGTAATGGACTCACATTACGACTATAACAACAACATCTAC 

GFKPNNTSAQY IFSTVNALP ECNADIVVVD
Replicase 1b



AAGTTT CAATGTGTACAAATTATGACCTTTCTGTTATTAATCAGCGTTTATCATATAAACATATTGTTTATGTTGGTGATCCACAACAACTT 
TTCAAAGTTACACATGTTTAATACTGGAAAGACAATAATTAGTCGCAAATAGTATATTTGTATAACAAATACAACCACTAGGTGTTGTTGAA 

EVSMCTNYDLSVINQRL SYKHIVYVGDPQQL
. Replicase 1b : 






CCTGCACCTAGAGTAATGATTACTAAAGGTGTTATGGAGCCTGTTGATTATAACGTTGTTACTCAACGTATGTGTGCTATAGGCC CTGATC
GGACGTGGATCTCATTACTAATGATTTCCACAATACCTCGGACAACTAATATTGCAACAATGAGTTGCATACAiACGAiAicCGGGAclAC 
PAPRVM I TKGVME Y_ N V V T 0 , H C A 1 . P „ 



TTTTCTTCATAAATGTTATAGATGTCCTGCTGAAATAGTTAATACAGTTTCTGAACTTGTTTATGAGAACAAGTTTGTCCC TGTTAAACCT 

AAAAGAAGTATTTACAATATCTACAGGACGACTTTATCAATTATGTCAAAGACTTGAACAAATACTCTTGTTCAAACAGGGACAATTTGGA
F L H K C Y R C P A E I V
Replicase 1b

CTAGTAAACAGTGTTTTAAAATCTTTTTTAAGGGTAATGTACAGGTTGACAATGGCTCTAGTATTAACAGAAAGCAGCTTGAA ATAGTTAA
GATCATTTGTCACAAAATTTTAGAAAAAATTCCCATTACATGTCCAACTGTTACCGAGATCATAATTGTCTTTCGTCGAACTTTATCAATT

ASKQCFK I FFKGN V Q V D N GSSINRKQLEIVK 
— Replicase 1b — v _ 

CTGTTTTTAGTTAAAAATCCAAGTTGGAGTAAGGCTGTGTTTATTTCTCCTTATAATAGTCAGAATTATGTTGCTAGT AGATTTTTAGGAC 
GACAAAAATCAATTTTTAGGTTCAACCTCATTCCGACACAAATAAAGAGGAATATTATCAGTCTTAATACAACGATCATCTAAAAATCCTG. 



LFLVKNPSW SKAVFISPYNSQNYVASRFLG

Replicase 1b



TCAAATTCAAACTGTTGATTCTTCTCAAGGTAGTGAGTATGATTATGTAATCTATGCACAAACTTCTGACACTGCACATGCTTGCA 

AGTTTAAGTTTGACAACTAAGAAGAGTTCCATCACTCATACTAATACATTAGATACGTGTTTGAAGACTGTGACGTGTACGAAGGTTAC 

Q I QTVDSSQGSEY D Y V I Y A Q T S D T A H A C N V
— Replicase 1b . 

ACCGTTTTAATGTTGCTATAACACGTGCTAAGAAGGGTATATTTTGTGTAATGTGTGATAAAACTTTGTTTGAT TCACTTAAGTTTTTTGAG
TGGCAAAATTACAACGATATTGTGCACGATTCTTCCCATATAAAACACATTACACACTATTTTGAAACAAACTAAGTGAATTCAAAAAACTC 

NRFNVA I TRAKKG I F CV MCDKTLFDSLKFFE 
Replicase 1b— ___________ 

ATTAAACATGCAGATTTACACTCTAGCCAGGTTTGTGGCTTGTTTAAAAATTGTACACGCACTCCTCTTAATTTACCACCAACTCAT GCACA 

TAATTTGTACGTCTAAATGTGAGATCGGTCCAAACACCGAACAAATTTTTAACATGTGCGTGAGGAGAATTAAATGGTGGTTGAGTACGTGT

IKHADLHSS QVCG LFK N C T R T P L N L P P T H A H



CACTTTCTTGTCGTTGTCAGATCAGTTTAAGACTACAGGTGATTTAGCTGTTCAAATAGGTTCAAATAATGTTTGTACTTATGAACATGTTA

GTGAAAGAACAGCAACAGTCTAGTCAAATTCTGATGTCCACTAAATCGACAAGTTTATCCAAGTTTATTACAAACATGAATACTTGTACAAT



TFLSLSDQFKTTG D LAVQI GSNNVCTYEHV
— Replicase 1b ■ , _________ 



TATCATTTATGGGTTTTAGGTTTGATATTAGTATTCCTGGTAGTCATAGTTTGTTTTGTACACGTGACTTTGCTATTCGTAATGTGCGTGGT
ATAGTAAATACCCAAAATCCAAACTATAATCATAAGGACCATCAGTATCAAACAAAACATGTGCACTGAAACGATAAGCATTACACGCACCA 

I SFMGFRFD I S I PG S H S L F C T RDFA I RNVRG

■ Replicase 1b- = _ 






TGGTTGGGTATGGATGTTGAAAGTGCTCATGTTTGTGGCGATAACATAGGTACTAATGTTCCTTTACAGGTTGGTTTTTCAAATGGTGTTAA
ACCAACCCATACCTACAACTTTCACGAGTACAAACACCGCTATTGTATCCATGATTACAAGGAAATGTCCAACCAAAAAGTTTACCACAATT

W L G M D V E S A H V C G D N I G T N V P L Q V G F S N G V N

TTTTGTTGTGCAAACTGAAGGTTGTGTGTCTACCAATTTTGGTGAT^ j 
AAAACAACACGTTlGACTicCAA^CACAGATGGTTAAAACCAiTACAATAATTTGGACAAACACGTTTTAGAGGTGGTCCACTTGTTAAAT 

FVVQT EGCVSTni ^ 



VVQTEGCVSTNFGDV IKPVCAKSPPGE C l_F_ 



GACACCTTGTTCCTTTTTTACGTAAAGGACAACCTTGGTTAATTGTTCGTAGACGCATTGTGCAAATGATATCTGATTATTTGTCCAATTTG ^ 

CTGTGGAACAAGGAAAAAATGCATTTCCTGTTGGAACCAATTAACAAGCATCTGCGTAACACGTTTACTATAGACTAATAAACAGGTTAAAC

RHLVPFLRKGQPWLIVRRRIVQMISDYLS N L
Replicase 1b



TCT GACATTCTTGTCTTTGTTTTGTGGGCAGGTAGTTTGGAATTAACTACAATGCGTTACTTTGTAAAAATAGGGCCAATTAAATATTGT^ 
AGACTGTAAGAACAGAAACAAAACACCCGTCCATCAAACCTTAATTGATGTTACGCAATGAAACATTTTTATCCCGGTTAATTTATAACAAT

SD I LVFVLWAGSL ELTTMRYFVKIGPIKYC



TTGTGGTAATTCTGCCACTTGTTATAATTCAGTTAGTAATGAATATTGTTGTTTTAAACATGCATTGGGTTGTGATTATGTTTACAATCCGT 
AACACCATTAAGACGGTGAACAATATTAAGTCAATCATTACTTATAACAACAAAATTTGTACGTAACCCAACACTAATACAAATGTTAGGCA 

CGNSATCYNSVS N E Y CCFKHALGCDYVY N P



-Replicase 1b- 

A TGCTTTTGATATACAACAGTGGGGTTATGTTGGTTCCTTGAGCCAGAACCACCACACGTTCTGTAACATTCATAGAAA 
TACGAAAACTATATGTTGTCACCCCAATACAACCAAGGAACTCGGTCTTGGTGGTGTGCAAGACATTGTAAGTATCTTTGCTCGTACTACGA 

Y A F D I Q Q W G Y V G S L S Q N H H T F C N I H R N E H D A 
Replicase 1b

TCT GGTGATGCTGTTATGACACGTTGTTTGGCAGTACATGATTGTTTTGTCAAAAATGTTGATTGGACTGTAACGTACCCCTTTATTGCA 
AGACCACTACGACAATACTGTGCAACAAACCGTCATGTACTAACAAAACAGTTTTTACAACTAACCTGACATTGCATGGGGAAATAACGTTT 

S G D A V M T R C L A V H D C F V K N V D W T V T Y P F I A N

TGAGAAATTTAT CAATGGCTGTGGGCGTAATGTCCAGGGACATGTTGTTCGCGCAGCCTTGAAATTGTATAAACCTAGTGTTATTCATGATA 
ACTCTTTAAATAGTTACCGACACCCGCATTACAGGTCCCTGTACAACAAGCGCGTCGGAACTTTAACATATTTGGATCACAATAAGTACTAT 

E K F I N G C G R N V Q G H V V R AALKLYKPS VI HD
Replicase 1b



TTGGT AATCCTAAAGGTGTACGTTGTGCTGTTACTGATGCCAAATGGTACTGTTATGACAAGCAACCTGTTAATAGTAATGTCAAGTTGTTG 
AACCATTAGGATTTCCACATGCAACACGACAATGACTACGGTTTACCATGACAATACTGTTCGTTGGACAATTATCATTACAGTTCAACAAC 

I G N P K G V R C A V T D A K W Y C Y D K Q P V N S N V K L L
Replicase 1b






GATTATGATTATGC AACCCATGGTCAACTTGATGGTCTTTGTTTATTCTGGAATTGTAATGTTGATATGTATCCAGAA TTTTCAATTGTGT 

CTAATACTAATACGTTGGGTACCAGTTGAACTACCAGAAACAAATAAGACCTTAACATTACAACTATACATAGGTCTTAAAAGTTAACACA 

DYDYATHGQLDGLC L F W N. C N V D M Y P E F S I V 
■ -Replicase 1b — ■ — ■ , 

TCGCTTTGACACACGTACTCGTTCTGTTTTTAATTTAGAAG GTGTTAATGGTGGTTCTCTTTATGTTAACAAACATGCGTTTCATACACCA

AGCGAAACTGTGTGCATGAGCAAGACAAAAATTAAATCTTCCACAATTACCACCAAGAGAAATACAATTGTTTGTACGCAAAGTATGTGGT



R F D T R T RSVFNLEGVNGGSLYVNKHAFHTP

— Replicase 1b— ' r 



CATATGATAAACGTGCTTTTGTTAAA TTAAAACCTATGCCCTTTTTTTACTTTGATGACAGTGATTGTGATGTTGTGCAAGAACAAGTTAA 

GTATACTATTTGCACGAAAACAATTTAATTTTGGATACGGGAAAAAAATGAAACTACTGTCACTAACACTACAACACGTTCTTGTTCAATT. 

AYDKRAFVKLKP M PFFYFDDSDCDVVQEQVN 
— — — Replicase 1b— 

TATGTACCCCTTCG CGCTAGTAGTTGTGTTACCCGTTGTAATATAGGTGGTGCTGTTTGTTCAAAACATGCAAATTTGTATCAAAAATATR- 


ATACATGGGGAAGCGCGATCATCAACACAATGGGCAACATTATATCCACCACGACAAACAAGTTTTGTACGTTTAAACATAGTTTTTATACi 

Y V P L R ASSCVTRCNI GGAVCSKHANLYQKY 
Replicase 1b _ 



TGAGGCATATAATACATTTACACAGGCTGGT TTTAACATTTGGGTACCACATAGTTTTGATGTTTATAATTTGTGGCAAATTTTTATTGAA/ 

ACTCCGTATATTATGTAAATGTGTCCGACCAAAATTGTAAACCCATGGTGTATCAAAACTACAAATATTAAACACCGTTTAAAAATAACTT". 

E A YNT FTQAGFN IW VPHSFDVYNLWQ I F I E
Replicase 1b



CTAATTTACAAAGTCTTGAAAATATAGCATTTAATGTTGTAAAAAAAGGGTGTTTTACTGGTGTTGATGGTGAGTTACCTGTTGCAGTTGT

GATTAAATGTTTCAGAACTTTTATATCGTAAATTACAACATTTTTTTCCCACAAAATGACCACAACTACCACTCAATGGACAACGTCAACA/ 

T N L Q S L E NIAFNVVKKGCFTGVDGELPVAVV 
' Replicase 1b' — — — 

AACGACAAAGTTTTTGTTCGCTATGGCGATGTTG ACAACTTGGTTTTTACAAATAAAACAACATTGCCTACTAATGTTGCTTTTGAATTGTT 

TTGCTGTTTCAAAAACAAGCGATACCGCTACAACTGTTGAACCAAAAATGTTTATTTTGTTGTAACGGATGATTACAACGAAAACTTAACAA



NDKVFVRYGDV DNLVFTNKTTLP TNVAFELF
Replicase 1b



TGCAAAACGAAAAATGGGTTTAACACCACCATTGTCTATTCTCAAAAATCTTGGTGTTGTTGCTACATATAAATTTGTTTTATGGGATTATG
ACGTTTTGCTTTTTACCCAAATTGTGGTGGTAACAGATAAGAGTTTTTAGAACCACAACAACGATGTATATTTAAACAAAATACCCTAATAC 

AKRKMGLT PPLS I LKNLGVVATYKFVL WDY 
~ — Replicase 1b . 



AAGCTGAAAGACCTTTTACCTCATATACTAAGAGTGTATGTAAATACACTGATTTTAATGAGGATGTTTGTGTTTGTTTTGACAATAGTATT
TTCGACTTTCTGGAAAATGGAGTATATGATTCTCACATACATTTATGTGACTAAAATTACTCCTACAAACACAAACAAAACTGTTATCATAA 

EAERPFTS YTKSVCKYTDFNEDVCVCFDNSI
Replicase 1b




CAGGGTTCGTATGAGCGTTTTACGCTTACTACGAACGCTGTTTTATTTTCTACTGTTGTCATTAAAAATTTAACACCTATAAAGTTGAATTT 
GTCCCAAGCATACTCGCAAAATGCGAATGATGCTTGCGACAAAATAAAAGATGACAACAGTAATTTTTAAATTGTGGATATTTCAACTTAAA 

QGSYERFTLTTNAVLFSTVVIKNLTP I KLNF 
Replicase 1b



TGGTATGTTGAATGGTATGCCAGTTTCTTCTATTAAGAGTGATAAAGGTGTTGAAAAATTAGTTAATTGGTACACATATGTTCGTAAAAATG 


ACCATACAACTTACCATACGGTCAAAGAAGATAATTCTCACTATTTCCACAACTTTTTAATCAATTAACCATGTGTATACAAGCATTTTTAC 

GMLNGMPVSS I KSDKGVEKLVNWYTY VRKN 
Replicase 1b



GTCAATTTCAAGATCATTATGATGGTTTTTACACTCAAGGTAGGAATTTATCAGACTTTACACCAAGAAGTGATATGGAGTATGATTTTCTT 


CAGTTAAAGTTCTAGTAATACTACCAAAAATGTGAGTTCCATCCTTAAATAGTCTGAAATGTGGTTCTTCACTATACCTCATACTAAAAGAA 

GQFQDHYDGFYTQGRNLSDFTPRSDMEYDFL 
■ Replicase 1b-- 



AACATGGATATGGGTGTTTTTATTAATAAATATGGTCTTGAGGATTTTAATTTTGAACATGTTGTATATGGTGATGTTTCAAAAACTACATT* 


TTGTACCTATACCCACAAAAATAATTATTTATACCAGAACTCCTAAAATTAAAACTTGTACAACATATACCACTACAAAGTTTTTGATGTAA 

NMDMGVFINKYGLEDFNFEHVVYGDVSKTTL
Replicase 1b



AGGAGGTCTTCATTTGTTGATATCACAGTTTAGGCTTAGTAAAATGGGTGTTTTGAAAGCTGATGATTTTGTCACTGCTTCTGACACAACTT 


TCCTCCAGAAGTAAACAACTATAGTGTCAAATCCGAATCATTTTACCCACAAAACTTTCGACTACTAAAACAGTGACGAAGACTGTGTTGAA 

GGLHLL I SQFRLSKMGVLKADDFVTASDTT 
_ . Replicase 1b 



TGAGGTGCTGTACTGTTACTTATCTTAATGAACTTAGTTCAAAAGTTGTTTGTACTTATATGGATTTGTTGTTGGACGACTTTGTTACTATA 


ACTCCACGACATGACAATGAATAGAATTACTTGAATCAAGTTTTCAACAAACATGAATATACCTAAACAACAACCTGCTGAAACAATGATAT 

LRCCTVTYLNELSSKVVCTYMDLLLDDFVT I 
— — ■ Replicase 1b ■ — — 



CTAAAGAGTTTAGATCTTGGTGTAATATCTAAAGTTCATGAAGTTATTATAGATAATAAACCTTATAGGTGGATGTTGTGGTGTAAAGATAA 


GATTTCTCAAATCTAGAACCACATTATAGATTTCAAGTACTTCAATAATATCTATTATTTGGAATATCCACCTACAACACCACATTTCTATT 

LKSLDLGV I SKVHEV I I DNKPYRWMLWCKDN
— . — Replicase 1b — — 



CCACTTGTCGACTTTTTATCCACAGTTGCAGTCTGCTGAATGGAAGTGTGGTTATGCTATGCCACAAATTTATAAGCTTCAACGTATGTGTT 


GGTGAACAGCTGAAAAATAGGTGTCAACGTCAGACGACTTACCTTCACACCAATACGATACGGTGTTTAAATATTCGAAGTTGCATACACAA 

HLSTFYPQLQSAEWKCGYAMPQIYKLQRMC
Replicase 1b



TGGAACCTTGTAATTTATATAATTATGGTGCTGGTATTAAGTTGCCTAGTGGTATAATGTTAAATGTTGTTAAATACACTCAGCTTTGTCAA 


ACCTTGGAACATTAAATATATTAATACCACGACCATAATTCAACGGATCACCATATTACAATTTACAACAATTTATGTGAGTCGAAACAGTT 

LEPCNLYNYGAGI KLPSGIMLNVV K Y T Q L C Q
Replicase 1b



TACCTAAATAGCACTACAATGTGCGTACCTCATAATATGCGTGTTTTGCACTATGGTGCTGGTTCTGACAAAGGTGTGGCA
ATGGATTTATCGTGATGTTACACGCATGGAGTATTATACGCACAAAACGTGATACCACGACCAAGACTGTTTCCACACCGTGGACCATGiT
Y L NSTTMCVPHNMR VLH Y G A GSDKGVAPGT 

^ ^ ACAGG 
Replicase 1b

TGTTTTAAAACGTTGGCTACCACCTGATGCAATAATCATTGATAATGATATCAATGATTATGTTAGTGATGCAGATTTTAG CATTACAGGT
ACAAAATTTTGCAACCGATGGTGGACTACGTTATTAGTAACTATTACTATAGTTACTAATACAATCACTACGTCTAAAATCGTAATGTCCA



VLKRWLPPDAIII D N D I NDYVSDADFS I TG

Replicase 1b

ATTGTGCTACTGTTTACCTTGAAGATAAGTTTGACTTACTTATTTCTGATATG TATGATGGTAGAATTAAATTTTGTGATGGTGAAAACGTG

TAACACGATGACAAATGGAACTTCTATTCAAACTGAATGAATAAAGACTATACATACTACCATCTTAATTTAAAACACTACCACTTTTGCAC



DCATVYLEDKFDL L I S D M Y D G R I KFCDGENV

TCTAAAGATGGTTTTTTTACTTATCTTAATGGTGTTATTAGAGAAAAATTAGCTATTGGTGGTAGTGTTGCCATTAAGATTAC AGAATATAC 
AGATTTCTACCAAAAAAATGAATAGAATTACCACAATAATCTCTTTTTAA^ 

S K D G F F T Y L N G V I R E K L A I G G S V A I K I T E Y T

Replicase 1b

TTGGAATAAGTATCTTTATGAATTAATACAAAGATTTGCTm 

AAc6TTATTCATAGAAATACTTAATTATGTTTCTAAACGAAAAACCTGAAACAAGACGiGCAGACAAT!ATGTAGGAGAAGTci.TCGAAAAG 

WNKY LYEL I Q R F A F WT LFC T..SVNTS S S E A F 

Replicase 1 b . _ 

TTATTGGTATTAATTATTTAGGTGACTTTATTCAAGGTCCTTTTATAGCTGGTAACACTGTTCATGCTAATTATATATTTTGGCG TAATTCT 
AATAACCATAATTAATAAATCCACTGAAATAAGTTCCAGGAAAATATCGACCATTGTGACAAGTACGATTAATATATAAAACC 

L I G I N Y L G D F I Q G P F I A G N T V H A N Y I F W R N S

— Replicase 1b . —— _______ 

ACTATTATGTCTTTGTCATACAATTCAGTTTTAGA^ 

TGATAATACAGAAACAGTATGTTAAGTCAAAATciAAATTCATTCAAAciTACATTTGTATTCCGGTGAiAACAACAATGTGAAiTTCTATC 
T ' " 3 L S Y N 3 V L ° L S Re K p,lca F se C * " K a T v v y T L K D s 

_ AGTGGTAI 


ACTACATTTACTATACCAAAACTCAAACTAATTCTCACCATCCAACAACAATGCATTATCACCGGCAAAACCACCAAAATCATTAGTAAATC 




TCTCAACTAAATGAA ACTTTTCTTGATTTTGCTTATTTTGCCCCTGGTTTCTTGCTTTTCTACA 

AGAGTTGATTTACTTTGAAAAGAACTAAAACGAATAAAACGGGGACCAAAGAACGAAAAGATGTACATTGTCATTACGATCATAAAGATACA 

MKLFLILLILPLVSCFST C NSNASISM
Spike

V S T K . * 
— Replicase 1b — 1 

TAC AATTAGGTGTTCCTGATAACTCTTCAACTATTGTCACAGGTTTGTTGCCAGTCCATTGGAT ; 
ATGTTAATCCACAAGGACTATTGAGAAGTTGATAACAGTGTCCAAACAACGGTCAG 

LQLGVPDNSST I V T G L LPVHWICANQSTSSY
CCAGCCAACGGCTTTTTCTATATTGATGTTGGTAAA^ 

GGTCGGTTGCCGAAAAAGATATAACTACAACCATTTGTGGCATCACGGAAACGTGAGGTATCACCAATAATACTACGATTGGTCATAATATA 
N G F F Y I QVGKHRSAFALHSGYYDANO Y Y^ 



p A IN » r 1 : '—j: — : — ~ - spike 



TTATCTCACTAATAAAA^ z 
AATAGAGTGATTATTTTATGTAAATTTACGAGGACAGTGAGACTTCTAAACATTCAAACCTTTGTGAAGAAAACTAAAAAATTCATTACAAA 



YLTNK IHLNAPVTL C K F G N T 3 F D F L 3 N V 



CTACTTCTCATGATTGTATAGTTAATTTGTCATTCACAGAACAGTTAGGTGTGCCTTTGGGCATAACTATATCGGGTGAAACTGTACGTTTG ^ 

gatgaagagtactaacaIatcaattaaacagtaagtgtcttgtcaatccacacggaaacccgtattgatatagcccactttgacatgcaaac ' 

STS'HDC I VNLSFTEQ .L^ Q V P L G 1 T I S G E T V R L 

CATTTATATAAT GCAACTCGTACTTTTTATGTGCCGGCCGCTTATAAACTTACTAAACTTAGTGTTAAATGTTACTTTAGTGAATCCTGTGT ^ 
GTAAATATATTACGTTGAGCATGAAAAATACACGGCCGGCGAATATTTGAATGATTTGAATCACAATTTACAATGAAATCACTTAGGACACA 

H L Y N A T R T F Y V P A A Y .JC. L T K L S V K C Y F S E S C V 

TTTTAGTGTTGTCAATGCCACCATTACTGTTAATGTCACCACACTTAATGGCCGTATAGTTAACTACACTGTTTGTGATGATTGTAATGGTT ^ 
AAAATCACAACAGTTACGGTGGTAATGACAATTACAGTGGTGTGAATTACCGGCATATCAATTGATGTGACAAACACTACTAACATTACCAA 

FSVVNAT I TV N V T T L N GRIVNYTVCDDCNG 

— " * m " " SpiKe ' 



ATACTGAT AACATATTTTCTGTTCAACAGGATGGCCGCATTCCTAATGGTTTCCCTTTTAATAATTGGTTTTTGTTAACTAATGGTTCCACA 
TATGACTATTGTATAAAAGACAAGTTGTCCTACCGGCGTAAGGATTACCAAAGGGAAAATTATTAACCAAAAACAATTGATTACCAAGGTGT 

YTDN1FSVQQD GRIP N GFPFNNWFLLTNGST 

1 m 1 " " " ' Spike 






TTAGTGGACGGGGTCTCTAGACTTTA TCAACCACTCCGTTTAACTTGTTTATGGCCTGTACCTGGTCTTAAATCTTCAACTGGTTTTGTT-

AATCACCTGCCCCAGAGATCTGAAATAGTTGGTGAGGCAAATTGAACAAATACCGGACATGGACCAGAATTTAGAAGTTGACCAAAACAA/ 

LVDGVSRL Y QPLRLTCLWPVPGLKSSTGFV 
Spike — 



TTTTAATGCCACTGGTTCTGATGTTAATTGTAACGGCTATCAACATAATTCTGTTGCTGATGTTATGCGTTACAATCTTAACCTCAGTGCT 

AAAATTACGGTGACCAAGACTACAATTAACATTGCCGATAGTTGTATTAAGACAACGACTACAATACGCAATGTTAGAATTGGAGTCACG/i 

FN ATGSDVNCNGYQHNSVADVMRYNLNLSA 
1 ■ — — — Spike ■ 

ATTCTGTGGACAATCTTAAGAGTGGTGTTATAGTTTTTAAAACTTTACAGTACGATGTTTTGTTTTATTGTAGTAATTCTTCTTCAGGTGT 

TAAGACACCTGTTAGAATTCTCACCACAATATCAAAAATTTTGAAATGTCATGCTACAAAACAAAATAACATCATTAAGAAGAAGTCCACA 

NSVDNLKSGV I VFKTLQYDVLFYCSNSSSGV 
" ■ Spike — — — 

CTTGACACCACAATACCTTTTGGCCCTTCCTCTCAACCTTATTACTGTTTTATAAACAGTACTATCAACACTACTCATGTTAGCACTTTTG 

GAACTGTGGTGTTATGGAAAACCGGGAAGGAGAGTTGGAATAATGACAAAATATTTGTCATGATAGTTGTGATGAGTAGAATCGTGAAAAC. 

L D T T I P FGPSSQPYYCFINSTI NTTHVSTF 
— — — — — — Spike — . 



GGGTATTTTACCACCCACTGTGCGTGAAATTGTTGTTGCTAGAACTGGTCAGTTTTATATTAATGGTTTTAAGTATTTCGATTTGGGTTTC> 

CCCATAAAATGGTGGGTGACACGCACTTTAACAACAACGATCTTGACCAGTCAAAATATAATTACCAAAATTCATAAAGCTAAACCCAAAG" 

S I L P P T VRE I VV.ART GQFY I NGFKYFDL GF 

1 — ■ Spike • ' 



TAGAAGCTGTCAATTTT AATGTCACGACTGCTAGTGCCACAGATTTTTGGACGGTTGCATTTGCTACTTTTGTTGATGTTTTGGTTAATGT1 

ATCTTCGACAGTTAAAATTACAGTGCTGACGATCACGGTGTCTAAAAACCTGCCAACGTAAACGATGAAAACAACTACAAAACCAATTACA/ 

I E A V N F N V T T A S A T D F WTV A F A T F V D V L V N V 
— — Spike — . — , 



AGTGCAACTAACATTCAAAACTTACTTTATTGCGATTCTCCATTTGAAAAGTTGCAGTGTGAGCACTTGCAGTTTGGATTGCAAGATGGTTT


TCACGTTGATTGTAAGTTTTGAATGAAATAACGCTAAGAGGTAAACTTTTCAACGTCACACTCGTGAACGTCAAACCTAACGTTCTACCAAA 



SATN 'QNL L YCDSPFEKLQCE HLQFGLQDGF 
Spike — . 




AATAAGACGTTTAAAAGAACTACTATTACAAAACGGACTCTGAATACAACGTGAGGGGTAAATAATAGTTGTATGCCTGTATTTAAAATGAC 

Y S A N F L ODNVLPETYVALP I YYQHTD I NFT 

' ~ Spike 



CAACTGCATCTTTTGGTGGTTCTTGTTATGTTTGTAAACCACGCCAGGTTAATATATCTCTTAATGGTAACACTTCAGTGTGTGTTAGAACA

GTTGACGTAGAAAACCACCAAGAACAATACAAACATTTGGTGCGGTCCAATTATATAGAGAATTACCATTGTGAAGTCACACACAATCTTGT 

ATASFGGSC YVCKPRQVNISLNGNTSVCVRT 

— Spike ■ — — — . 




TCTCATTTTTCAATTAGGTATATTTATAACCGCGTTAAGAGTGGTTCACCAGGTGACTCTTCATGGCATATTTATTTAAAGAGTGGCACTTG 


AGAGTAAAAAGTTAATCCATATAAATATTGGCGCAATTCTCACCAAGTGGTCCACTGAGAAGTACCGTATAAATAAATTTCTCACCGTGAAC 

cupSIRYIYNRVKSGSPGOSSWHIYL K S G T C 
Spike 



TCCATTTTCTTTTTCTAAGTTAAATAATTTTCAAAAGTTTAAGACTATTTGTTTCTCAACCGTCGAAGTGCCTGGT AGTTGTAATTTTCCAC


AGGTAAAAGAAAAAGATTCAATTTATTAAAAGTTTTCAAATTCTGATAAACAAAGAGTTGGCAGCTTCACGGACCATCAACATTAAAAGGTG 

PFSFSKLNNFQK FKTI CFSTVEVPGSCNFP 
. Spike - — — ■ 



TTG AAGCCACCTGGCATTACACTTCTTATACTATTGTTGGTGCTTTGTATGTTACTTGGTCTGAAGGTAATTCCATTACTGGTGTACCTTAT ^ 
AACTTCGGTGGACCGTAATGTGAAGAATATGATAACAACCACGAAACATACAATGAACCAGACTTCCATTAAGGTAATGACCACATGGAATA 

LEATWHYTSYTI VGA L YVTWSEGNSI TGVPY 

■ Spik© — — — — — ^ ~" 

CCTGTC TCTGGTATTCGTGAGTTTAGTAATTTAGTTTTAAATAATTGTACCAAATATAATATTTATGATTATGTTGGTACTGGAATTATACG 
GGACAGAGACCATAAGCACTCAAATCATTAAATCAAAATTTATTAACATGGTTTATATTATAAATACTAATACAACCATGACCTTAATATGC 

PVSGIREFSNLVLNNCTKYNI Y D Y V G T G 1 I R 
, . Spike " 



TTCTTCAA ACCAGTCACTTGCTGGTGGTATTACATATGTTTCTAACTCTGGTAATTTACTTGGTTTTAAAAATGTTTCCACTGGTAACATTT ^ 
AAGAAGTTTGGTCAGTGAACGACCACCATAATGTATACAAAGATTGAGACCATTAAATGAACCAAAATTTTTACAAAGGTGACCATTGTAAA 

SSNQSLAGG i TYVSNSGNL L.GFKNVSTGN I 
Spike — " 



TTATT GTGACACCATGTAACCAACCAGATCAAGTAGCTGTTTATCAACAAAGCATTATTGGTGCCATGACCGCTGTTAATGAGTCTAGATAT ^ 
AATAACACTGTGGTACATTGGTTGGTCTAGTTCATCGACAAATAGTTGTTTCGTAATAACCACGGTACTGGCGACAATTACTCAGATCTATA



F ,VTPCNQPDGVAVYQQSI IGA MTA V N E S R Y 
, — — — Spike — 



RRrTTGCAAAACTTACTACAGTTACCTAACTTTTATTATGTTAGTAATGGTGGTAACAATTGCACTACGGCTGTTATGA TTTATTCTAATTT _ 


CCGAACGTTTTGAATGATGTCAATGGATTGAAAATAATACAATCATTACCACCATTGTTAACGTGATGCCGACAATACTAAATAAGATTAAA 

QLQNLLQLPNFYYVSNG GNNCTTAVM 1 YSNF 
. ^ Spike — — 



TGGTATTT GTGCTGATGGTTCTTTAAT TCCTGTTCGTCCG^

ACCATAAACACGACTACCAAGAAATTAAGGACAAGCAGGCGCATTAAGATCACTATTACCATAAAGTCGGTATTAGTGACGATTAAATAGGT 

g J c A D G S L IPVRPRNSSDNGISA 1 j T A " N L S 
— Spike " — 1 1 



TTCCC TCTAACTGGACTACTTCAGTTCAAGTTGAGTACCTCCAAATT

AAGGGAGATTGACCTGATGAAGTCAAGTTCAACTCATGGAGGTTTAATGATCATGAGGTTATCAACAACTAACACGATGAATACACACATTA

I psNWTTSVQVEYLQ I TST P IVVDCATYVCN 
___ — — Spike — — — 






GGTAACCCTCGTTGTAAGAATCTACTTAAGCAGTATACTTCTGCTTGTAAAACTATTGAAGATGCCTTACGACTTAGTGCTCATTTG

CCATTGGGAGCAACATTCTTAGATGAATTCGTCATATGAAGACGAACATTTTGATAACTTCTACGGAATGCTGAATCACGAGTAAACCTTT 

GNPRCKNLLKQYTSA _ C K T I E D A L R L S A H L E 

■ ■ — - — Spike ll 

TAATGATGTTAGTAGTATGCTAACTTTCGATAGCAATGCTTjTAGTTTGGCTAATGTTACTAGTTTTGGAGATTATAACCTTTC TAGTGTT 
ATTACTACAATCATCATACGATTGAAAGCTATCGTTACGAAAATCAAACCGATTACAATGATCAAAACCTCTAATATTGGAAAGATCACAA 

NDVSSMLTFDSNAF S LANVTSFGDYNLSSV 
—————————————————— _____ Spjke ■ 



TACCTCAGAGAAACATTCATTCAAGCCGTATAGCAGGACGTAGTGCTTTGGAAGATTTGTTGTTTAGCAAAGTTGTTA CATCTGGTTTGGG 
ATGGAGTCTCTTTGTAAGTAAGTTCGGCATATCGTCCTGCATCACGAAACCTTCTAAACAACAAATCGTTTCAACAATGTAGACCAAACCC. 



LPQRN 'HSSR1AGRS _A^L EDLLFSKVVTSGLG 

ACTGTTGATGTTGACTATAAGTCTTGTACTAAAGGTCTTTCTATTGCTGACCTTGCTTGTGCTCAGTACTACAATGGCATAATGGT TTTGC( 
TGACAACTACAACTGATATTCAGAACATGATTTCCAGAAAGATAACGACTGGAACGAACACGAGTCATGATGTTACCGTATTACCAAAACGt 
T V D V D Y K SCTKGLSIADLACAQYYNG I M V L 

' • Spike 

AGGTGTTGCTGATGCTGAACGTATGGCCATGTACACAGGTTCTCTTATAGGTGGCATGGTGCTCGGAGGTCTTACATCAGCAGCCGCC ATAC 
TCCACAACGACTACGACTTGCATACCGGTACATGTGTCCAAGAGAATATCCACCGTACCACGAGCCTCCAGAATGTAGTCGTCGGCGGTATG 

GVADAERM AMYT GSL. I GG MV LGGLTSAAA I 
Spike : fl 1 



CTTTTTCTTTGGCACTGCAAGCACGACTTAACTATGTTGCTTTACAAACTGATGTGCTTCAAGAAAATCAGAAAATTTTGGCTGCAT CATTT 
GAAAAAGAAACCGTGACGTTCGTGCTGAATTGATACAACGAAATGTTTGACTACACGAAGTTCTTTTAGTCTTTTAAAACCGACGTAGTAAA 
PFSLALQARLNYVAL Q T D V LQENGK I LAASF 



AATAAGGCTATTAATAATATTGTTGCTTCTTTTAGTAGCGTTAATGATGCTATTACACATACTGCAGAGGCTATACATACTGT TACTATTGC 

TTATTCCGATAATTATTATAACAACGAAGAAAATCATCGCAATTACTACGATAATGTGTATGACGTCTCCGATATGTATGACAATGATAACG 

NKAI NN 1 VASFSSVN _ OAITHTAEAIHTVTI p 

Spike 

ACT I A . A I AA . G A.TTCAGG_ATGTJ,GT,TAAT,CA-ACAG 

TGAATTATTCTAAGTCCTACAACAATTAGTTGTCCCATCACGAGAATTGGTAGAGTGAAGTGTTAACTCTGTATTAAAAGTCCGGTAAAGAT 

LNKIQDVV NQQGSA LNHLTSQLRHNFQ AIS 
~~~~ ———————— Spike — — — — . 



ATTCAATTCATGCTATTTATGACCGGCTTGATTCAATTCAAGI^ 

TAAGTTAAGTACGATAAATACTGGCCGAACTAAGTTAAGTTCGGCTAGTTGTTCAACTGTCTAATTAATGACCTGCCGAACGTCGAAACTTA 

NSIHAIYDRLDS I QADQQVDRL I TGRLAALN 

' ~ Spike 




GCATTTGTTTCCCAAGTTTTGAATAAATATACTGAAGTTCGTGGTTCCAGACGCTTAGCACAGCAGAAGATTAATGAATGTGTCAAGTCACA 


CGTAAACAAAGGGTTCAAAACTTATTTATATGACTTCAAGCACCAAGGTCTGCGAATCGTGTCGTCTTCTAATTACTTACACAGTTCAGTGT 

AFVSQVLNKYTEVRGSRRLAQQK I NECVKSQ 
■ Spike ■ 



ATCTAATAGATATGGTTTTTGTGGCAATGGCACTCACATCTTTTCAATCGTCAACTCAGCTCCAGATGGTTTGCTTTTTCTTCATACTGTTT 


TAGATTATCTATACCAAAAACACCGTTACCGTGAGTGTAGAAAAGTTAGCAGTTGAGTCGAGGTCTACCAAACGAAAAAGAAGTATGACAAA 

3NRYBFCQN6THIF3.lv NSAPDGLLFLHTV 
— ■ — ■ — Spike ' 



TGCTGCCAACTGATTACAAGAATGTAAAGGCGTGGTCTGGTATCTGTGTTGATGGCATTTATGGCTATGTTCTGCGTCAACCTAACTTGGTT 


ACGACGGTTGACTAATGTTCTTACATTTCCGCACCAGACCATAGACACAACTACCGTAAATACCGATACAAGACGCAGTTGGATTGAACCAA 

IIPTDYKNVKAWSG I CVOG I Y GYVLRQPNLV 
Spike — ~ 

CTTTATTCTGATAATGGTGTCTTTCGTGTAACTTCCAGGGTCATGTTTCAACCTCGTTTACCTGTTTTGTCTGATTTTGTGCAAATATATAA 


GAAATAAGACTATTACCACAGAAAGCACATTGAAGGTCCCAGTACAAAGTTGGAGCAAATGGACAAAACAGACTAAAACACGTTTATATATT 

LYSONGVFRVTSRVMFQPRLPVLSDFVQ 1 Y N 
_____ ■ Spike 



TTGTAATGTTACTTTTGTTAACATATCTCGTGTCGAGTTACATACTGTCATACCTGACTACGTTGATGTTAATAAAACATTACAAGAGTTTG 


AACATTACAATGAAAACAATTGTATAGAGCACAGCTCAATGTATGACAGTATGGACTGATGCAACTACAATTATTTTGTAATGTTCTCAAAC 

CNVTFVN I SRVELHTV I PDY-VDV N K T L Q E F 
— Spike ■ 



CACAAAACTTACCAAAGTATGTTAAGCCTAATTTTGACTTGACTCCTTTTAATTTAACATATCTTAATTTGAGTTCTGAGTTGAAGCAACTC 


GTGTTTTGAATGGTTTCATACAATTCGGATTAAAACTGAACTGAGGAAAATTAAATTGTATAGAATTAAACTCAAGACTCAACTTCGTTGAG 

AONLPKYVKPNFDLTPFNLTYLNLSSE L K Q L 
— Spike — — — 

GAAGCTAAAACTGCTAGTCTTTTCCAAACTACTGTTGAATTACAAGGTCTTATTGATCAGATTAACAGTACATATGTTGATTTGAAGTTGCT 


CTTCGATTTTGACGATCAGAAAAGGTTTGATGACAACTTAATGTTCCAGAATAACTAGTCTAATTGTCATGTATACAACTAAACTTCAACGA 

FAKTASLFQTTVELQGL 1DQ I NSTYVDLKLL 
— Spike ■ — ' 



TAATAGGTTTGAAAATTATATCAAATGGCCTTGGTGGGTTTGGCTCATTATTTCTGTTGTTTTTGTTGTATTGTTGAGTCTTCTTGTGTTTT 


ATTATCCAAACTTTTAATATAGTTTACCGGAACCACCCAAACCGAGTAATAAAGACAACAAAAACAACATAACAACTCAGAAGAACACAAAA 

NRFENYIKWPWWVWLI ISVVFVV L L S L L V F 
_____ . — Spike — 1 ~ 



GTTGTCTTTCTACAGGTTGTTGTGGTTGTTGCAATTGTTTAACTTCATCAATGCGAGGCTGTTGTGATTGTGGTTCAACTAAACTTCCTTAT 



CAACAGAAAGATGTCCAACAACACCAACAACGTTAACAAATTGAAGTAGTTACGCTCCGACAACACTAACACCAAGTTGATTTGAAGGAATA 

CCLSTGCCGCCNCLTSSMRGCCD C GSTKLPY 
■ Spike — 






TATGAATTTGAAAAGGTCCACGTTCAATAATGCCTTTCGGTGGCCTATTTCAACT 



TACTC 



TTGAAAGTACTATTAATAAGAGTGTGGCTAA 



ATACTTAAACTTTTCCAGGTGCAAGTTATTACGGAAAGCCACCGGATAAAGTTGAATGAGAACTTTCATGATAATTATTC 



TCACACCGATT 



YEFEKVHVQ ., 
Spike J 



! M P F G G LFQLTLESTI NKSVAN 
1 — ORF 4ab- A - 

CTCAAATTACCACCTCATGATGTTACTGTCTTGCGTGACAATCTTAAACCTGTTACTACACTTAGTACTATCACTGCTTATTTGT TAGTTAI 
GAGTTTAATGGTGGAGTACTACAATGACAGAACGCACTGTTAGAATTTGGACAATGATGTGAATCATGATAGTGACGAATAAACAATCAATI 

l K L P P HDVTVLRDNLKPVTTLST I TAYLLV 
ORF 4ab ' L L v 

^"^*^~|^ TGT C ACT* TATTTTGC TTT ATTC A A A.CC T CTT AC T GC T AGAG G TCG C 6TTGC T TGTTTT B TTT T A A AAC T ATT<3iXC A C T ATC T B TCI 

AAACAAACAGTGAATAAAACGAAATAAGTTTGGAGAATGACGATCTCCAGCGCAACGAACAAAACAAAATTTTGATAACTGTGATAGACAG/ 

L F v T Y F ALFKPLTARGRVACFVLKLL TLSV 
■ ORF 4ab ' L S V 



ATGTGCCTTTATTGGTTCTT TTTGGTATGTATCTTGACAGTTTTATAATTTTTTTTCTACGCTGTTGTTTCGATTCATACATGTTGGCTATT

TACACGGAAATAACCAAGAAAAACCATACATAGAACTGTCAAAATATTAAAAAAAAGATGCGACAACAAAGCTAAGTATGTACAACCGATAA 

Y V P L L V L F Q H Y L D 3 F _I. I FFLRCCFDSYMLA I 

ORF 4ab- — ■ — — — — — — 



ATGC ^ TATC T CTAA y AAAAATm } cATT ^ 

TACGGATAGAGATTATTTTTAAAAAGTAAACAAAACAAGTTACAATGATTTGATACGAAGCAAAGTCCGTTCACAACCATAGAACTTGTTAG 
"P1SN KNFSFVLFN V^T K LCFVSGKCWYLEQS 

AT y TTATGAAAA } CGTT } TGCTGCTAT j TM ^ 

TAAAATACTTTTAGCAAAACGACGATAAATACCACCACTGGTGATACAGCAAAATCCACCACTTTGATAATGAAAACAAAGAAAACTACTG 

FYENRFAA 1 YGGDH -Y_ VVLGGET I TFVSFDD 

■ ORF 4ab . 



TTTATGTTGCTATTAGAGGTTCTTGTGAAAAGAACCTACAACTTATGCGTAAGGTTGACTTGTATAATGGTGCTGTCATTTACA TTTTTGCC 

AAATACAACGATAATCTCCAAGAACACTTTTCTTGGATGTTGAATACGCATTCCAACTGAACATATTACCACGACAGTAAATGTAAAAACGG 

LYVAIRGSCEKN LQLMRKVDLYNGAV I Y I FA 

ORF 4ab ' F A 

^ TACTCCTCTCAAC TATA C G A AG AT" GTTC^C T T CB/VT TA A TTI3ATGACAATGiC~;CAT"TSTCCTC AA,TTCTA 
CTTCTCGGACAACAACCATATCAAATGAGGAGAGTTGATATGCTTCTACAAGGAAGCTAATTAACTACTGTTACCGTAACAGGAGTTAAGAT 



|N F LRLIDDNGIVLNS 
— E ■ _ 

EEPVVGIVYSSQLYEDVPS'IN 
■ ORF 4ab —I 




TTTTATGGCTCCTTGTTATGATATTTTTCTTTGTGTTGGCAATGACCTTTATTAAACTGATTCAATTGTGTTTTACTTGTCATTATTTTTTT 

AAAATACCGAGGAACAATACTATAAAAAGAAACACAACCGTTACTGGAAATAATTTGACTAAGTTAACACAAAATGAACAGTAATAAAAAAA 
ILWLLVM I FFFVLAh ^ F I KL I QLCFTCHYFF 



AGTAGGACATTATATCAACCAGTTTATAAAATTTTTCTTGCTTACCAAGATTATATGCAAATAGCACCTGTTCCAGCTGAAGTACTAAATGT 


TCATCCTGTAATATAGTTGGTCAAATATTTTAAAAAGAACGAATGGTTCTAATATACGTTTATCGTGGACAAGGTCGACTTCATGATTTACA 
SRTLYQPVYK 1 FLAY^QDYMQ I APVPAEVLNV 



CTAAACTAAACGATGTCTAATAGTAGTGTGCCTCTTTCAGAGGTTTATGTCCATTTACGTAACTGGAACTTTAGTTGGAATTTAATTCTAAC 


GATTTGATTTGCTACAGATTATCATCACACGGAGAAAGTCTCCAAATACAGGTAAATGCATTGACCTTGAAATCAACCTTAAATTAAGATTG 

MSNSSVPLSEVYVHLRNWNFSWNL 1 L T 
— E-J I ■ M 



AGTTTTTATAGTTGTGTTGCAGTATGGGCATTATAAGTATAGCAGACTTCTTTATGGTTTAAAGATGTCTGTTTTATGGTGTTTATGGCCAC 


TCAAAAATATCAACACAACGTCATACCCGTAATATTCATATCGTCTGAAGAAATACCAAATTTCTACAGACAAAATACCACAAATACCGGTG 

VF1VVL0YGHYKYSRLLYGLKMSV L W C L W P 
. ■ M — 



TTGTTCTAGCTTTGTCTATTTTTGACTGTTTTGTCAATTTTAATGTGGACTGGGTCTTTTTTGGTTTTAGTATTCTTATGTCTATTATTACA 


AACAAGATCGAAACAGATAAAAACTGACAAAACAGTTAAAATTACACCTGACCCAGAAAAAACCAAAATCATAAGAATACAGATAATAATGT 

LVLALS I FDCFVNFNVDWVFFGFS 1 L M S I . I T 
. — — — M — — — — — ^— ^— — _____ - 



CTTTGTTTATGGGTTATGTATTTTGTTAATAGTTTCAGACTTTGGCGCCGTGTTAAAACTTTTTGGGCTTTTAATCCTGAAACTAATGCAAT 


GAAACAAATACCCAATACATAAAACAATTATCAAAGTCTGAAACCGCGGCACAATTTTGAAAAACCCGAAAATTAGGACTTTGATTACGTTA 

LCLWVMYFVNSFRLWRRVKTFWAFNPETNA I 
— — — M : 



CATCTCTCTCCAGGTTTATGGACATAATTATTACTTACCGGTGATGGCTGCACCTACAGGTGTTACATTAACACTTCTTAGTGGTGTACTTC 


GTAGAGAGAGGTCCAAATACCTGTATTAATAATGAATGGCCACTACCGACGTGGATGTCCACAATGTAATTGTGAAGAATCACCACATGAAG 

I SLQVYGHNYYLPVMAAPTGVTLTLL SGVL 
„ M 



TTGTTGATGGCCATAAGATTGCTACTCGTGTTCAAGTGGGTCAGTTGCCTAAATATGTAATAGTTGCTACACCTAGTACCACAATTGTTTGT
AACAACTACCGGTATTCTAACGATGAGCACAAGTTCACCCAGTCAACGGATTTATACATTATCAACGATGTGGATCATGGTGTTAACAAACA

LVDGHK 1 ATRVQVGQLPKYV I VATPSTT I VC 
— M — 



GACCGTGTTGGTCGCTCTGTTAATGAAACAAGCCAGACTGGTTGGGCATTCTACGTCCGTGCTAAACATGGTGATTTTTCTGGTGTTGCCTC
CTGGCACAACCAGCGAGACAATTACTTTGTTCGGTCTGACCAACCCGTAAGATGCAGGCACGATTTGTACCACTAAAAAGACCACAACGGAG

DRVGRSVNETSQTGWAFYVRAKHGDFSGVAS 
— M 




TCAGGAGGGTGTTTTGTCAGAAAGAGAGAAGTTGCTTCATTTAATCTAAACTAAACAAAATGGCTAGTGTAAATTGGGCCGATG ACAGAG
AGTCCTCCCACAAAACAGTCTTTCTCTCTTCAACGAAGTAAATTAGATTTGATTTGTTTTACCGATCACATTTAACCCGGCTACTGTCTCG
QEGVLSEREKLLHLI   MASVNWADDR



GCTAGGAAGAAATTTCCTCCTCCTTCATTTTACATGCCTCTTTTGGTTAGTTCTGATAAGGCACCATATAGGGT CATTCCCAGGAA 
CGATCCTTCTTTAAAGGAGGAGGAAGTAAAATGTACGGAGAAAACCAATCAAGACTATTCCGTGGTATATCCCAGTAAGGGTCCTTAGAAC 




A R KKFPPPSFYMPLLV 



^ VSSDKAPYRV I PRNL 

CCCTATTGGTAAGGGTAATAAAGATGAGCAGATTGGTTATTGGAATGTTCAAGAGCGTTGGCGTATGCGCAGGGGGCAACGTGTTGATT

GGGATAACCATTCCCATTATTTCTACTCGTCTAACCAATAACCTTACAAGTTCTCGCAACCGCATACGCGTCCCCCGTTGCACAACTAAAC(
P I G K G N K D E Q I G Y W N V Q E R W R M R R G Q R V D L



CTCCTAAAGTTCATTTTTATTACCTAGGTACTGGACCTCATAAGGACCTTAAATTCAGACAACGTTCTGATGGTGTTGTTTGGGTTGCTAA G 
GAGGATTTCAAGTAAAAATAATGGATCCATGACCTGGAGTATTCCTGGAATTTAAGTCTGTTGCAAGACTACCACAACAAACCCAACGATTC



P P K V HFYYLGTGPHKDL 
— ■ N 



K F RQRSDGVVWV 



A K 



GAAGGTGCTAAAACTGTTAATACCAGTCTTGGTAATCGCAAACGTAA TCAGAAACCTTTGGAACCAAAGTTCTCTATTGCTTT ft rrTrr fl oA 


AAGAGATAACGAAACGGAGGTCT 



CTTCCACGATTTTGACAATTATGGTCAGAACCATTAGCGTTTGCATTAGTCTTTGGAAACCTTGGTTTC ' ' ' " 1 ' ' 1 



E G A K T VNTSLGNRKRN Q K P L -E PKFSIALPPE 



GCTCTCTGTTGTTGAGTTTGAGGATCGCTCTAATAACTCATCTCGTGCTAGCAGTCGTTCTTCAACTCGTAACAACTCACGAGACTCTTCTC 
CGAGAGACAACAACTCAAACTCCTAGCGAGATTATTGAGTAGAGCACGATCGTCAGCAAGAAGTTGAGCATTGTTGAGTGCTCTGAGAAGAG 
L S V VEFEORSNNSSRA 

: — N— 



S S RSSTRNNSRDS
GTAGTACTTCAAGACAACAGTCTCGCACTCGTTCTGATTCTAACCAGTCTTCTTCAGATCTTGTTGCTGCTGTTACTTTGGCTTTAAAGAAC
CATCATGAAGTTCTGTTGTCAGAGCGTGAGCAAGACTAAGATTGGTCAGAAGAAGTCTAGAACAACGACGACAATGAAACCGAAATTTCTTG

R S T S R D Q S R T R S D S N Q S S S D L V A A V T L A L K N 



AATCCAAAACTATTGGTCAGCTTCAGTGGATCAAGAAGACCATGAAGGTGAGGATTCTTTGGATTATTCGGAGAAAGAGTTGGGTCCCGACT

L G F D N Q S K S P S S S G T S T P K K P N K P L S Q P R A D

TAAGCCTTCTCAGTTGAAGAAACCTCGTTGGAAGCGTGTTCCTACCAGAGAGGAAAATGTTATTCAGTGCTTTGGTCCTCGTGATTTT AATC

ATTCGGAAGAGTCAACTTCTTTGGAGCAACCTTCGCACAAGGATGGTCTCTCCTTTTACAATAAGTCACGAAACCAGGAGCACTAAAATTAG



KPSQLKKPRWKRVPT
ACAATAT GGGGGATTCAGATCTTGTTCAGAATGGTGTTGATGCCAAGGGTTTTCCACAGCTTGCTGAATTGATTCCTAATCAGGCTGCGTTA ^
TGTTATACCCCCTAAGTCTAGAACAAGTCTTACCACAACTACGGTTCCCAAAAGGTGTCGAACGACTTAACTAAGGATTAGTCCGACGCAAT ' 

H N M G D S D L V Q N G V D A K G F P Q L A E L I P N Q A A L

TTCTTTGATAGTGA GGTTAGCACTGATGAAGTGGGTGATAATGTTCAGATTACCTACACCTACAAAATGCTTGTAGCTAAGGATAATAAGAA ^ 
AAGAAACTATCACTCCAATCGTGACTACTTCACCCACTATTACAAGTCTAATGGATGTGGATGTTTTACGAACATCGATTCCTATTATTCTT 

FFDSEVSTDEVGDNV QITYTYKMLVAKDNKN 

CCTTCCTAAGTTCAT TGAGCAGATTAGTGCTTTTACTAAACCCAGTTCTATCAAAGAAATGCAGTCACAATCATCTCATGTTGCTCAGAACA ^ 
GGAAGGATTCAAGTAACTCGTCTAATCACGAAAATGATTTGGGTCAAGATAGTTTCTTTACGTCAGTGTTAGTAGAGTACAACGAGTCTTGT 

L P K F I E Q I S A F T K P S S I K E M Q S Q S S H V A Q N

CAGTACTTAA TGCTTCTATTCCAGAATCTAAACCATTGGCTGATGATGATTCAGCCATTATAGAAATTGTCAACGAGGTTTTGCATTAAATT ^ 
GTCATGAATTACGAAGATAAGGTCTTAGATTTGGTAACCGACTACTACTAAGTCGGTAATATCTTTAACAGTTGCTCCAAAACGTAATTTAA
GTTT TGTAATTCCAGTTGAATGTTTATTATTATTAGTTGCAACCCCATGCGTTTAGCGCATGATAAGGGTTTAGTCTTACACACAATGGTAG ^
CAAAACATTAAGGTCAACTTACAAATAATAATAATCAACGTTGGGGTACGCAAATCGCGTACTATTCCCAAATCAGAATGTGTGTTACCATC 



«3'UTR« 



GC CAGTGATAGTAAAGTGTAAGTAATTTGCTATCATATTAACATGTCTAGAGGAAAGTCAGAACTTTTTCTGTTTGTGTTGTTGGAGTACTT ^ 
CGGTCACTATCATTTCACATTCATTAAACGATAGTATAATTGTACAGATCTCCTTTCAGTCTTGAAAAAGACAAACACAACAACCTCATGAA



■3'UTR- 



AAA GATCGCATAGGCGCGCCAACAATGGAAGAGCCAACAACATATCTAAAAATGTTTTGTCTGGTACTTGTTAATGATATTGTTTTTGATAT ^ 
TTTCTAGCGTATCCGCGCGGTTGTTACCTTCTCGGTTGTTGTATAGATTTTTACAAAACAGACCATGAACAATTACTATAACAAAAACTATA



GGATACACAAAAAAAAAAAAAAAA 

., \ I ■ ■ ' ' i ' ' ' ' I ■ > 27532 

CCTATGTGTTTTTTTTTTTTTTTT
Figure 4 Alignments



a. 5' Untranslated region (Genomic sequence) aligned with human coronavirus

5 

....I.... I ....I.... | ....i...., 

10 ™;c^c sis SSSSg 

.........i ....i...., ...... ... , , .........i 

EMCR5-UTR CCTCTCAACT AAACGAAATT TTT-CTAGTG CTGTCATTTG TTATr- rr» r-n-J^L
229E5-UTR TTTCTCAACT AAACGAAATT TTTGCTATGG TGATGCTGGA

....I.... I ....|.. ..| ,...|. ...| . ... |. ... I III, 
125 135 145 155 • I 

20 ESS.'ES ^ TCaAGTT tgtaa-actg gttaggcaag tgttgtattt tctgtottta 

229E5-UTR AATTGAAATT TCATTTGGGT TGCAACAGTT TGGAAGCAAG TGCTGTGTGT CCTA-GTCTA
....|....| ....I.... | ....I. ...| .. | | | , 

25 ssss ssss diss sis sssss: diss 

....|....| . ... | .... | ....|....i | , 

245 255 265 275 '"ink" "" 

EMCR5-UTR CTGCTTATTG TGGAAGCAAC GTTCTGTCGT TGTGGAAACC SATOSOTfrT »»™
229E5-UTR CGTGTTTGTG TGGAAGCAAA GTTCTGTCTT tISa^C ScTA
b. Putative Orf 1a

....I. ...I . ... | .... | . ... | .... | .......... , , 

15 25 35 45 55" 

229E MFYNQVT LAVASDSEIS GFGFAIPSVA VRAYSEAAAQ GFQACRFVAP 

S5 :::::::::: =SSR 5SSHS ESSE 3S5SS 
pp v ssss; SSS ss= ilES 

BOCOV MSKINKYGLE LHWAPEFPWM FBDAEEKLDN PSSSEVDIVC STtSpt? SSSSS^ 

"«™ FKHAPBFPBM LmnoS SSSS ££££££ 

SS oov =SSS SSSS SS SSSS =S 

S v SSSS SSSS 535S52 KK= ~s IS 

BARS COV NGTCGIiVELE KGVLPQLEQP YVFIKRSDA1 STNHGHKWE 

— x» — 1 — ils'" 1 -ih— 1 

229E r»r™^ VDKYMCGFDG KPVLPKNMWE FRDYFNDNTD S-IVIGGVTY QLAHDVI RKD 

0=43 c-» PKS S SSS^g vSSSSZ JSESSS 
5 S= SEE = SB §§£ 

SARS COV VPHVGETPIA YRNVHRKNG NKGAGGHSYG r^oVxlm SS 

I ....|.... 1 ....J.... I ....|. ...| ....|....| . ... 1 .... J 

S iSI I™ « ™ : :: = ==s 

TGEV IKSITYCS-T YEHTFLDGTA MKVARTPKI KKNWLSE PLATIYRETr 

s s= = s= =r : =ss £™
EMCR



MPVQSRKFIA PWVMYLRKCG EKGAYIKDYK RGGFEH VYNFKVED AYDLVHDEPK 

IPAYAKQWLQ PWSILLRKGG NKGSVTSGHF RRAVTMP VYDFNVED ACEEVHLNPK 

IHVSSMAMRR LVGEVTAKVM DALGSNLSAL FQIVKQ QIARIFQK ALAIFENVNE 

GALRELTREL NGGAVTRYVD NNFCGPDGYP LDCIKDFLAR AGKSMCTLSE QLDYIESKRG 

. ... I .... I I I ....I.... I I . ... I .... I 

245 255 265 275 285 295 

SPFITNGISL LDIIVKPVFF NAFVKCNCGS ENWSVGAWDG YLSSCCGTPA KKLCVVPGNV 
SPVMTNGSNI LEAFTKPVFI SALVQCTCGT KSWSVGDWTG FKSSCCNVIS NKLCWPGNV 
SPFVDNGSDA RSIIRRPVFL HAFVKCKCGS YHWTVGDWTS YVSTCCGFKC KPVLVASCSA 
SPFMGNGDCL SKCFDTLHFI AATLRCPCGS ESSGVGDWTG FKTACCGLSG KVKGVTLGDI 
GKFSKKAYAL IRGYRGVKPL LYVDQYGCDY TGSLADGLEA YADKTLQEMK ALFPTWSQEL 
GKFSKKAYAL IRGYRGVKPL LYVDQYGCDY TGGLADGLEA YADKTLQEMK ALFPIWSQEL 
GKYSRKAYAL LKGYRGVKSI LFLDQYGCDY TGRLAKGLED YGDCTLEEMK ELFPVWCDSL 
LPQRIAALKM AFAKCARSIT VWVERTLVV KEFAGTCLAS INGAVAKFFE ELPNGFMGSK 
VYCCRDHEHE IAWFTERSDK SYEHQTPFEI KS — AKKFDT FKGECPKFVF PLNSKVKVIQ 

....|....| I I ....|....| . ... I .... I 

305 315 325 335 345 355 

VPGDVIITST DAGCGVKYYA GLVVKHITNI TGV SLWRVT A VHS DGMFVAT SSYDALLHRN 
KPGDAVITTQ QAGAGIKYFC GMTLKFVANI EGVSVWRVIA LQSVDCFVAS STFVEEEHVN 
MPGSVWTRA GAGTGVKYYN NMFLRHVADI DGLAFWRILK VQSKDDLACS GKFLEHHEEG 
KPGDAWTSM SAGKGVKFFA NCVLQYAGDV EGVSIWKVIK T FT VDETVCT PGFEGELN — 

LFDVIVAWHV VRDP RY VMRLQSAATI R SVAYVA NPTEDLCDGS WIKEPVHVY 

PFDVTVAWHV VRDP RY VMRLQSASTI R SVAYVA NPTEDLCDGS WIKEPVHVY 

DNEWVAWHV DRDP RA VMRLQTLATI R SIGYVG QPTEDLVDGD WVREPAHLL 

IFTTLAFFKE AAVR -WENIPNAP RGTKGFEWG NAKGTQVWR GMRNDLTLLD 

PRVEKKKTEG MGRIRSVYP VASPQECNNM HLSTLMKCNH CDEVSWQTCD FLKATCEHCG 

....|.... I ....|.... I 1 I. t 1 ....|.... I ♦ I ..... I 

365 375 385 395 405 415 

SLDPFCFDVN TLLSNQLRLA FLGASVTEDV KFAASTGVID ISAGMFGLYD DILTNNKPWF 
RMDTFCFNVR NSVTDECRLA MLGAEMTSNV RRQVASGVID ISTGWFDVYD DIFAESKPWF 
FTDPCYFLND SSLATKLKFD ILSGKFSDEV KQAIIAGHW VGSALVDIVD DALG — QPWF 
— DFIKPESK SLVACSVKRA FITGDIDDAV HDCIITGKLD LSTNLFGNVG LLFKK-TPWF 
ADDSIILRQY NLVDIMSHFY MEADTWNAF YfeVALKDCGF VMQFGYIDCE QDSCDFKGWI 
ADDSIILRQH NLVDIMSCFY MEADAWNAF YGVDLKDCGF VMQFGYIDCE QDLCDFKGWV 
AANAIVKRLP RLVETMLYT- — DSSVTEFC YKTKLCDCGF ITQFGYVDCC GDAC DFRGWV 
QKADIPVEPE GWSAILDGHL CYVFRSGDRF YAAPLSGNFA LSDVHCCERV VCLSDGVTPE 
-TENLVIEGP TTCGYLPTNA VVKMPCPACQ DPEIGPEHSV ADYHNHSNIE TRLR — KGGR 

| | 1 1 I I I ! I I I I 

425 435 445 455 . # 465 475 

VRKASGLFDA IWDAFVAAIK LVPTTTGGLV RFVKSIASTV LTVSNGVIIM CAD V PDAFQP 
VRKAEDIFGP CWSALASALK QLKVTTGELV RFVKSICNSA VAVVGGTIQI LASVPEKFLN 
IRKLGDLASA PWEQLKAVVR GLGLLSDEVV LFGKRLSCAT LSIVNGVFEF LADVPEKLAA 
VQKCGALFVD AWKWEELCG SLTLTYKQIY EVVASLCTSA FTIVNYKPTF VVPD-NRVKD 

PGNMIDGFAC TTCGHVYEVG DLIAQSSGVL PVNPVLHTKS AAGYGG FGCKDSFTL 

PGNMIDGFAC TTCGHVYETG DLLAQSSGVL PVNPVLHTKS AAGYGG FGCKDSFTL 

PGNMMDGFLC PGCSKSYMPW ELEAQSSGVI PKGGVLFTQS TDTVN : RESFKL 

INDGLILAAI YSSFSVSELV TALKKGEPFK FLGHKFVYAK DAAVS FTL 

TRCFGGCVFA YVGCYNKRAY WVPRASADIG SGHTGITGDN VETLN EDLLEILS 

....|.. ..I . ... 1 .... I . ... | .... 1 , ... 1 .... | I I 

485 495 505 515 525 535 

VYRTFTQAIC AAFDFSLDVF KIG DVKF KRLGDYVLTE NALVRLTTEV VRGVRDARIK 

AFDVFVTAIQ TVFDCAVETC TIA GKAF DKVFDYVLLD NALVKLVTTK LKGVRE RGLN 

AVTVFVN FLN EFFESACDCL KVG GKTF NKVGSYVLFD NALVKLVKAK ARGPRQAGIC 

LVDKCVKVLV KAFDVFTQII TIAGIEAKCF VLGAKYLLFN NALVKLVSVK ILGKKQKGLE 
YGQTVVYFGG CVYWSPARNI WIP — ILKSS VKSYDSLVYT GVLGCKAIVK ETNLICKALY 
YGQTWYFGG CVYWSPARNI WIP — ILKSS VKSYDGLVYT GVVGCKAIVK ETNLICKALY 
YGHAWPFGS AVYWSPYPGM WLP — VlWfSS VKSYADLTYT GVVGCKAIVQ ETDAICRSLY 
AKAATIADVL RLFQSARVIA EDVWS-SFTE KSFEFWKLAY GKVRNLEEFV KTYVCKAQMS 
RERVNINIVG DFHLNEEVAI ILAS-FSAST SAFIDTIKSL DYKSFKTIVE SCGNYKVTKG 

....|....| ....|....| . . . . I I ....|....| I I 

545 555 565 575 585 595 

KAMFTKVWG PTTEVKFSVI ELATVNLRLV DCAPVVCPKG KIVVIAGQAF FYSGGFYRFM 
KVKYATWVG STEEVKSSRV ERSTAVLTIA NNYSKLFDEG YTVVIGDVAY FVSDGYFRLM 
EVRYTSLWG STTKVVSKRV ENANVNLVW DEDVTLNTTG RTWVDGLAF FESDGFYRHL 
CAFFATSLVG ATVNVTPKRT ETATISLNKV DDVVAPG-EG YIVIVGDMAF YKSGEYYFMM 
LDYVQHKCGN LHQRELLGVS DVWHKQLLLN RGVYKPLLEN IDYFNMRRAK FSLETFTVCA 
LDYVQHKCGN LHQRELLGVS DVWHKQLLLN RGVYKPLLEN IDYFNMRRAK FSLETFTVCA 
MDYVQHKCGN LEQRAILGLD DVYHRQLLVN RGDYSLLLEN VDLFVKRRAE FACK-FATCG 
IVILAAVLGE DIWHLVSQVI YKLGVLFTKV VDFCDKHWKG FCVQLKRAKL IVTETFCVLK 
KPVKGAWNIG QQRSVLTPLC GFPSQAAGVI RSIFARTLDA ANHSIPDLQR AAVTILDGIS 

I . ... I .... I ....I. ...I 

605 615 625 635 645 655 

VDSTTVLNDP VFTGELFYTI KFSGFKLDGF N H QFVNASSATD AIIAVELLLS
ASPNSVLTTA
ADADWIEHP 
SSPNFVLTNN 
DGFMPFLLDD 
DGFMPFLLDD 
DGLVPLLLDG 
GVAQHCFQLL 
EQSLRLVDAM 



VYKPLFAFNV NVMGTRPE 

VYKSACELKP VFECDPIP — D F 

VFKAVKVPSY DIVYDVDNDT KSKMIAKLGS 

LVPRAYYIiAV SGQAFCDY 

LVPRAYYLAV SGQAFCDY — 

LVPRSYYLIK SGQAFTSM 

LDAIHSLYKS FKKCALGR 

VYTSDLLTNS VIIMAYVTG 



KFPTTVTCEN 
PLPVAASVAE 
SFEYDGDIDA 
ADKLCHAVVS 
AGKICHAWS 
MVNFSHEVTD 

IHGDLLF 

— GLVQQTSQ 



LESAVLFVND 
LCVQTDLLLK 
AIVKVNELLI 
KSKELLDVSL 
KSKELLDVSV 
MCMDMALLFM 
WKGGVHKIVQ 
WLSNLIiGTTV
665
DFKTAVFVYT 
KITEFQfcDYS 
NYNTPYKTYS 
EFRQQSLCFR 
DSLGAAIHYL 
DSLGAAIHYL 
HDVKVATKYV 
DGDEIWFDAI 
EKLRPIFEWI
725
LQSNNPQCAI 
ISVLDITDAA 
HEQQDLQGFL 
NELEDIKETN 
HGAYIWESD 
HGAYIWESD 
NGLFAVANGG 
RFKKDENIYY 
CFIDWNKAL
675
CVVDGCSVIV 
IDVIDNEIIV 
CVVRGDKCCI 
AFKDDKSIFV 
NSKIVDLAQH 
NSKIVDLAQH 
KKVTGKLAVR 
DSVDVEDIiGV 
EAKLSAGVEF
685
RRDAT-FATH 
KPNIS-LCVP 
TCTLQ-FKAP 
EAYFKKYKMP 

FSDFG 

FSDFG 

FKALG 

VQEKS ID 

LKDAW
695
VCFKDCYSIW 
LYVRDYVDKW 
SYVEDAVN-F 
ACLAKHIG-L 
TSFVSKIVHF 
TSFVSKIVHF 
VAVVRKITEW 
FEVCDDVTLP 
EILKFLITGV 



705 
EQFCIDNCGE 
DDFCRQYSNE 
VDLCTKNIGT 
WNIIKKDSCK 
FKTFTTSTAL 
FKTFTTSTAL 
FDLAVDTAAS 
ENQPGHMVQI 
FDIVKGQIQV 



....| | 

715 
PWFLTDYNAI 
SWFEDDYRAF 
AGFHEFYITA 
RGFLNLFNHL 
AFAWVLFHVL 
AFAWVLFHVL 
AAGWLCYQLV 
EDDGKNYMFF 
ASDNIKDCVK
735
VQASESK — V 
VKAAESK — A 
TTCCTMSGFE 

IQAIKN - 

IYFVKN 

IYFGKN 

ITFLSD 

TPMSQLG 

EMCIDQ
785
LTLTSNGLLG 
LNLTQQGLLG 
VSFGLDGIVV 
LLIGNG — VK 
RICLSGRKIY 
RICLSGSKIY 
RVCLAGCKVY 
IECCGEPWNT 
QLLMPLKAPK
795
NCAKRFRRVL 
TCAKRFKRWL 
TVARKFKRLG 
WCDGCKGFA 
EVERGLLHSS 
EVERGLLHSS 
EVVQKRLSAY 
IFKKAYKEPI 
EVTFLEGDSH 



745 
LLERFLPKCP 
FVDTIVPPCP 
CFMPTIPQCP 

ILCP 

-IPRYASAVA 
-IPRYASAVA 
-VPELVKNFV 

AINVVCK 

VTIAG 



755 
EILLSIDDGH 
SILKVIDGGK 
AVLEEIDGGS 
DPLLDLDYGA 
QAFQSVAKW 
QAFRSGAKVG 
DKFKVFFKVL 
AGGKTVTFG- 
AKLRSLNLGE 



765 
1WNLFVEKFN 
IWNGVIKNVN 
IWRSFITGLN 
IWYNCMPGCS 
LDSLRVTFID 
LDSLRVTFID 
IDSMSVSVLS 
— ETTVQEIP 
VFIAQSKGLY
775
FVTDWLKTLK 
SVRDWLKSLK 
TMWDFCKRLK 
DP-SVLGSVQ 
GLSCFKIGRR 
GLSCFKIGRR 
GLTVVKTASN 
PPDVVPIKVS 
RQCIRGKEQL 



805 
VKLLDVYNGF 
GILLEAYNAF 
ALLAEMYNTY 
NQLSKGYNKL 
QLPLDVYDLT 
QLPLDVYDLT 
VMPVGCNEAT 
EVDTDLTVEQ 
DTVItTSEEW 



815 
LETVCSVVHT 
LDTWSTVKI 
LSTWENLVL 
CNAARNDIEI 
MPSQVQKAKQ 
MPSQVQKTKQ 

LLSVIYEKMC 
LKNGELEALE
AGVCIKYYAV
GGLTFKTYAF 
AGVSFKYYAT 
GGIPFSTFKT 
KPIYLKGSGS 
KGIYLKGSGS 

LVGEIE 

DDLKLFPEAP 
TPVDSFTNGA
835
NVP-YWISG 
DKP-YIVIRD 
SVP-KIVLGG 
PTNTFIEMTD 
DFSLADSWE 
DFSLADSVVE 
PAVVEDDWD 
EPPPFENVAL 
IVGTPVCVNG
FVSRVIRRER
IVCKVENKTE 
CFHSVKSVFA 
AIYSVIEQGK 
WTTSLTPCG 
WTTSLTPCG 
WKAPLTYQG 
VDKNGKDLDC 
LMLLEIKDKE
855
CD — VTFPCV 
AEWIELFPHN 
SV — FQIPVQ
YS EPP

YS EPP 

CC KPP
865
SCVTFFYEFL 
DRIKSFSTFE 
AGIEKFKVFL 

S FR 

KVADKICIVD 
KVADKICIVD 
TSFEKICWD 

CHLI 

CALS 



875 
DTCFGVSK — 
SAYMPIAD — 

NCVHPVV 

DADVPWDNG 
NVYMAKAGDK 
NVYMAKAGDK 
KLYMAKCGDQ 

YRDYESD 

PGLLATN 



885 
— PNAIDVEH 
— PTHFDIEE 
— PRVIETSF 
TISTADWSEP 
YYPVWD-DH 
YYPWVD-GH 
FYPVWDNDT 

DD 

NV
895
LELKETVFVE 
VELLDAEFVE 
VELEETTFKP 
ILLEPAEYVK 
VGLLDQAWRV 
VGLLDQAWRV 
IGVLDQCWRF 
IEEEDAEECD 
FRLKGG API K
905
PKDGGQFFVS 
PGCGGILAVI 
PALNGGIAIV 
PKNNGNVIVI 
PCAGRRVTFK 
PCAGRCVTFK 
PCAGKKVEFN 
TDSGEAEECD 
GVTFGEDTVW 
915

DDYLWYW-D D- 
DEHVFYKK-D G- 
DGFAFYYD-G T- 
AGYTFYKDED E- 
IY

VY 

LY 

HF 



EQPTVKEIIS MPKIIKVFYE 
EQPTVNEIAS TPKTIKVFYE 
DKPKVKEIPS T-RKIKINFA 

-TNSECEEEDE" D TK 

EVQGYKNVRI T FE 
935
YPASCNGVLP 
YPSNGTNILP 
YPTDGNSVVP 
YPYGFGKIVQ 
LDNDFNTILN 
LDKDFNTILN 
LDATFDSVLS 
VLALIQDPAS 
LDERVDKVLN 
VAFTKLAGGK
VAFTKAAGGK 
ICFKKKGGGD 
RMYNKMGGGD 
TACGVFEVDD 
TACGEFEVDD 
KACSEFEVDK 
IKYPL'PLDED 
EKCSVYTVES 
ISFSDDV

VSFSDDV 

VKFSDEV 

KT-VSFSEEV 
TVDMEEFYAV 
TVDMEEFYAV 
DVTLDELLDV 
YS-VYNGCiV 
GTEVTEFACV 
965
IVHDVEPTHK 
EVKDIEPVYR 
SVKTIDPVYK 
DVQEIAPVTR 
VIDAIEEKLS 
VIDAIEEKLS 
VLDAVESTLS 
HKDALDWNL 
VAEAWKTLQ 
975
VKLIFEFEDD 
VKLCFEFEDE 
VSLEFEFESE 
VKLEFE FDNE 
PCKELEGVGA 
PCKELEGVGA 
PCKEHDVIGT 
PSGEETFVVN 
PVSDLLTN--
985
-VVTSLCKKS 
— KLVDVCEKA 
-TIMAVLNKA 
-IVTGVLERA 
-RVSAFLQKL 
-KVSAFLQKL 
-KVCALLNRL 
NCFEGAVKPL 
— MGIDLDEW 
995
FGKSIIYTG- 
IGKKIKHEG- 
VGNRIKVTG- 
IGTRYKFTGT 
EDNPLFLFD- 
EDNSLFLFD- 
AEDYVYLFD- 
PQKVVDVLG- 
SVATFYLFD- 



1005 
DWEGLHEVLT 
DWDSFCKTIQ 
GWDDVVEYIN 
TWEEFEESIS 
— EAGEEVLA 
— EAGEEVLA 
— EGGEEVIA 
— DWGEAVDA 
— DAGEENFS 
SAMNVIG

SALSVVS 

VAIEVLK 

EELDAIFDTL 
PKLYCAFTAP 
PKLYCAFTAP 
PKMYCSFSAP 

QEQLCQQ 

SRMYCSFYPP 
1025 1035 1045 1055 1065 1075

EMCR — QHIKLPQF YIYDEEGGYD VSKP— VMIS QWPISDDSDG CWEASTDFH Q — LESVREE 

229E — CYVNLPTY YIYDEEGGND LSLP — VMIS EWPLSVQQAQ QEATLPDIAE D— WDQVEE

PEDV — DHVEVPKY YIYDEEGGTD PNLP — VMVS QWPLNDDTIS QDLLDVEVVT DAP I DSEGDE 

TGEV ANQGVELEGY FIYDTCGGFD IKNPDGIMIS QYDINITADE KSEVSASSEE EE— VE S VEE D 

OC43 EDDDFLEESD VEEDDVEGEE TDLTVTSAGQ PCVASEQEES SEVLEDTLDD GPSVETSDSQ

BOCOV EDDDFLEESG VEEDDVEGEE TDLTVTSAGE PCVASEQEES SEILEDTLDD GPCVETSDSQ 

MHV DDEDCVAADV VDADENQGDD ADDS AALVT D TQEEDGVAKG QVGVAESDAR LDQVEAFDIE

AIPV — EPLQHTFE EPVENSTGSS KTMTEQWVE DQELPWEQD QDWVYTPTD LEVAKETAEE 

SARS CoV DEEEEDDAEC EEEEIDETCE HEYGTEDDYQ GLPLEFGASA ETVRVEEEEE EDWLDDTTEQ 

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

1085 1095 1105 1115 1125 1135

EMCR VD HE QPFGEVEHAL SIRQ 

229E VNS IFD IETVDVKHDV S 

PEDV VDSSAPEKVA D VANSEPGDDG LPVAPETNVE SEVEEVAATL SFIKDTPSTV

TGEV PENEIVEASE GAEGTSSQEE VET VE V ADI T STEEDVDIVE VSAKDDPWAA AVDVQEAEQF 

OC43 VEEDVEMS DFVDL ESVIQD YENVCFEF YTT

BoCoV VEEDVQMS DFGDL ESVIQD YENVCFEF YTT 

MHV KVEDPILN ELSAE LNAPADK TYEDVLAFD AIYSEALSAF YAVP 

AIPV VD — —

SARS CoV SEIEPEP E PTPEEPVNQF TG 

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

1145 1155 1165 1175 1185 1195 

EMCR PFSFSFR DELGVRVLDQ SDNNCWISTT LIQLQLTKLL DDSIEMQLFK VGKVDSIVQK 

229E PFEMPFE ELNGLKILKQ LDNNCWVNSV MLQIQLTGIL DGDYAMQFFK MGRVAKMIER 

PEDV TKDPFAFDFV SYGGLKVLRQ SHNNCWVTST LVQLQLLGIV DDP-AMELFS AGRVGPMVRK

TGEV NPSLPPFKTT NLNGKIILKQ GDNNCWINAC CYQLQAFDFF NNE - AWEKFK KGDVMDFVNL 

OC43 EPEFV KVLGLYVPKA TRNNCWLRSV LAVMQKLFCQ FKD — KNLQD LWVLYKQQYS 

BoCoV EPEFV KVLDLYVPKA TRNNCWLRSV LAVMQKLPCQ FKD— KNLQD LWVLYKQQYS 

MHV GDETHF KVCGFYSPAI ERTNCWLRST LIVMQSLPLE FKD — LEMQK LWLSYKSSYN 

AIPV EFILIFAVPK EEWSQKDGA QIKQEPIQW KPQ — REKKA KKFKVKPATC

SARS CoV YLKLTD NVAIKCVDIV KEAQSANPMV IVNAANIHLK HGGGVAGALN KATNGAMQKE 

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

1205 1215 1225 1235 1245 1255

EMCR CYELSHLISG SLGDSGKLLS ELLKDKYTCS ITFEMSCDCG KKFDEQVGCL FWIMPYTKLF

229E CYTAEQCIRG AMGDVGLCMY RLLKDLHTGF MVMDYKCSCT SGRLEESGAV LFCTPTKKAF 

PEDV CYESQKAILG SLGDVSACLE SLTKDLHTLK ITCSVVCGCG TGERIYEGCA FRMTPTLEPF

TGEV CYAATTLARG HSGDAEYLLE LMLNDYSTAK I VLAAKCGCG EKE I VLERAV FKLTPLKESF 

OC43 QLFVDTLVNK IPANIVLPQG GYVADFAYWF LTLCDWQCVA YWKCIKCDLA LKLKGLDAMF

BoCoV QLFVDTLVNK IPANIVVPQG GYVADFAYWF LTLCDWQCVA YWKCIKCDLA LKLKGLDAMF

MHV KEFVDKLVKS VPKSIILPQG GYVADFAYFF LSQCSFKAYA NWRCLKCDMD LKLQGLDAMF

AIPV EKPKFLEYKT CVGDLTWIA KALDEFKEFC IVNAANEHMT HGSGVAKAIA DFCGLDFVEY 

SARS COV SDDYIKLNGP LTVGGSCLLS GHNLAKKCLH WGPNLNAGE DIQLLKAAYE NFNSQDILLA 

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

1265 1275 1285 1295 1305 1315 

EMCR QKGECCICHK MQTYKLVSMK GTGVFVQD — PAPIDIDAFP VRPICSSVYL GVKGSGHYQT 

229E PYGTCLNCNA PRMCTIRQLQ GTIIFVQQK- PEPVNPVSFV VKPVCSSIFR GAVSCGHYQT

PEDV PYGACAQCAQ VLMHTFKSIV GTGIFCRD — TTALSLDSLV VKPLCAAAFI GK-DSGHYVT

TGEV NYGVCGDCMQ VNTCRFLSVE GSGVFVHDIL SKQTPEAMFV VKPVMHAVYT GTTQNGHYMV

OC43 FYGDWSHIC KCGESMVLID VDVPFTARFA LKDKLFCAFI TKRIVYKAAC VVDVNDSHSM 

BOCOV FYGDWSHVC KCGESMVLID VDVPFTAHFA LKDKLFCAFI TKRSVYKAAC VVDVNDSHSM 

MHV FYGDWSHVC KCGTGMTLLS ADI PYTLHFG LRDDKFCAFY TPRKVFRAAC VVDVNDCHSM 

AIPV CEDYVKKHGP QQRLVTPSFV KGIQCVNNW GPRHGDNNLH EKLVAAYKNV LVDGWNYVV 

SARS CoV PLLSAGIFGA KPLQSLQVCV QTVRTQVYIA VNDKALYEQV VMDYLDNLKP RVEAPKQEEP

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

1325 1335 1345 1355 1365 1375 

EMCR NLYSFDKAID GFGVFDIK NSSV NTVCFVDVDF HS-VEIEAGE 

229E NIYSQNLCVD GFGVNKIQP WTNDAL NTICIKDADY NAKVEISVTP

PEDV NFYDAAMAID GYGRHQIK YDTL NTICVKDVNW TAPLVPAVDS 

TGEV DDIEHGYCVD GMGIKPLKKR CYTSTLFINA NVMTRAEKPK QEFKVEKVEQ QPIVEENKSS 
FVGKIAQWIK
FVGKIVQWIK 
FVGQIVAWVK 
LCGPYKDYGK 
AFGVLLSNFG 

I 



2435 
KYNIRSALFV 
KYNLKAS AAV 
RYNAKALGVF 
GAKVRTLNYM 
KNAFLTFKWS 
KNAFLT FKWS 
KNALQTFNWN 
KASVKSVVAS 
KNSVKSVAKL 

I 



2445 
VKQKWC-VIV 
LKSKWW-LLA 
FKLKLY-WFK 
.RQLNKP-SVW 
MVARGA-CII 
VVARGA-CII 
WSRGF-FLV 
YKTVLCKVVL 
CLDAGI-NYV 



I I 

2455 
TLFKFLLLLY 
KFTKLLLLIY 
VLGKFSIiGIY 
RYAKLVLLLI 
ATIFLLWFNF 
ATI FLLWFNF 
ATVFLLWFNF 
ATLLIVWFVY 
KSPKFSKIiFT 



2495 

STF 

SNF 

SSF 

SSF 



2505 



I 



2545 

KHIR DP 

KHIT DP 

QHLR- DP 

DFKS---- DP 

YEAD RR 

YEAD RR 

HWD RR 

QVYKDAASGF 
VTISSYKLDL 

SFLGFHQKQW 
VFLGYKETNW 
VFLGLQQSIW 
SYFGYVEYSW 
LHWSFRLLVA 



NK 

VK 

DK 

IK 

NTFSLVTICD LYSMQDVGFK 
NTFSLVTICD IiYSIQDVGFK 
TTFGIFTLCD LYQVSDVGYR 

DSFD VIi 

APSYCNGVRE LYLNSSNVTT 



I 



2515 
DIYCGNSMVC 
DDYCDGSLGC 
NEYCN-SVTC 
SAVCGNSILC 
NQYCNGSIAC 
NQYCNGSIAC 
SSFCNGSMVC 
R-YCADDFIC 
MDFCEGSFPC 



2555 
ILISLQPFVI 
LFSNMQPFIV 
LIGNVMPFFY 
LWNRLVQLSY 
AFVDYTGVLK 
AFVDYTGVLK 
VSFDYISLFK 
IFNWNWLYLV 
TILGLAAEWV 

«.».| .... J 

2615 
FLHFVPFDVL 
FLHFIPFDVI 
FLQLVPFDVF 
FLHWNFESI 
LANMLPAHVF 



2575 



I 



2565 

LVILLIFG 

MVLLLIFG 

IiAFLAIFG " 

FAFLAVSSS-- - 
IVIELIVSYA LYTAWFYPLF 
IVIELIVSYA LYTAWFYPLF 
LWELVIGYS LYTVCFYPLF 

FLILFVKP 

LAYMLFTKFF YLLG L 



2625 
CNEFLATFIV 
CDELLVTVIV 
GDEIVVFFIV 
SAEFVIVVIV 
MRFYIIIASF 



I I 

2635 
CKIVLFVRHI 
IKVISFVRHV 
TRVLMFIKHV 
VKAVLALKHI 
IKLFSLFRHV 






BOCOV 

MHV 

AIPV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OC43 

BoCoV 

MHV 

AIPV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OC43 

BoCoV 

MHV 

AIPV 

SARS COV 



EMCR 

229E 

PEDV 

TGEV 

OC43 

BoCoV 

MHV 

AIPV 

SARS CoV 



EMCR 

229E

PEDV 

TGEV 

OC43 

BoCoV 

MHV 

AIPV 

SARS COV 



229E 

PEDV 

TGEV 

OC43 

BoCoV 

MHV 

AIPV 

SARS COV 



EMCR 

229E 

PEDV 

TGEV 

OC43 

BoCoV 

MHV 

AIPV 

SARS COV 



ALISIQILTT WLPELLMLST LHWSVRLLVS LANMLPAHVF MRFYIIIASF IKLFSLFRHV 
GLIGMQLLTT WLPEFFMLET MHWSARFFVF VANMLPAFTL LRFYIWTAM YKIFCLCRHV 
VAGFVIICYC VKYLVLNSTV LQTGVCFLDW FVQTVFSHFN FMGAGFYFWL FYKIYIQVHH 
SAIMQVFFGY FASHFISNS WLMWFIIS IVQMAPVSAM VRMYIFFASF YYIWKSYVHI 



. ... I .... I 

2645 
IVGCNNADCV 
LFGCENPDCI 
CLGCDKASCV 
VFACSNPSCK 
AYGCSKSGCL 
AYGCSKSGCL 
MYGCSRPGCL 
ILYCKDVTCE 
MDGCTSSTCM 



i 



.1. 



I 



2655 
ACSKSARLKR 
ACS KS ARLKR 
ACSKSARLKR 
TCSRTARQTR 
FCYKRNRSLR 
FCYKRNRSLR 
FCYKRNRSVR 
VCKRVARSNR 
MCYKRNRATR 



|.. 

2705 
GNTFINGDIA
GSTFITPEVS 
GCTFINDVIA 
ENTFICDEIV 
GNTFITVEAA 
GNTFITVEAA 
GNTFITHEAA 
QNTFMSPEVA 
GSTFISDEVA 



I I 

2715 
RELGNWKTA 
RELGNITKTN 
TEVGNVVKLN 
RDLSNSVKQT 
LDLSKELKRP 
LDLS KELKRP 
ADLSKELKRP 
GELSEKLKRH 
RDLSLQFKRP 



I.. 

2665 
VPLQTIINGM 
FPVNTIVNGV 
VPVQTIFQGT 
IPIQVVVNGS 
VKCSTIVGGM 
VKCSTIVGGM 
VKCSTWGGT 
QEVSVWGGR 
VECTTIVNGM 

I 



...,|....| 

2675 
HKSFYVNANG 
QRSFYVNANG 
SKSFYVHANG 
MKTVYVHANG 
IRYYDVMANG 
IRYYDVMANG 
LRYYDVMANG 
KQIVHVYTNS 
KRSFYVYANG 



I 



2725 
VQPTAPAYVI 
VQPTGPAYVM 
VQPTGPATIL 
VYATDRSHQE 
IQPTDVAYHT 
IQPTDVAYHT 
VNPTDSAYYL 
VKPTAYAYHV 
INPTDQSSYI 



|.. 

2735 
IDKVDFVNGF 
IDKVEFENGF 
IDKVEFSNGF 
VTKVECSDGF 
VTDVKQVGCS 
VTDVKQVGCY 
VTEVKQVGCS 
VDEACLVDDF 
VDSVAVKNGA 



I | 

2685 
GTCFCNKHNF 
GSKFCKKHRF 
GSKFCKKHNF 
TGKFCKKHNF 
GTGFCSKHQW 
GTGFCSKHQW 
GTGFCAKHQW 
GYNFCKRHNW 
GRGFCKTHNW 

! 



I I 

2695 
FCVNCDSFGP 
FCVDCDS YGY 
FCLNCDSYGP 
YCKNCDSYGF 
NCIDCDSYKP 
NCIDCDSYKP 
NCLNCSAFGP 
YCRNCDDYGH 
NCLNCDT FCT 



2745 
YRLYSGDTFW 
YRLYSCETFW 
YYLYSGDTFW 
YRFYVGDEFT 
MRLFYDRDGQ 
MRLFYDRDGQ 
MRLFYERDGQ 
VNLKYKAATP 
LHLYFDKAGQ 



| | 

2755 
RYDFDITESK 
RYNFDITESK 
KYNFDITDSK 
S YD YDVKHKK 
RT YDDVNAS I* 
RTYDDVNASL 
RVYDDVSASL 
GKDSASSAVK 
KTYERHPIiSH 



. ... 1 .... I I • I . ... | .... 1 

2765 2775 2785 2795 

YSCKEVLKN CNVLENFIVY NNSGS — NIT 

YSCKEVFKN CNVLDDFIVF NNNGT — NVT 

YTCKEALKN- CSIITDFIVF NNNGS — NVN 

YSSQEVLKS MLLLDDFIVY SPSGS — ALA 

FVDYSNLLHS KV KSVPNMHVVV VENDA — DKA 

FVDYSNLLHS KV KSVPNMHVVV VENDA — DKA 

FVDMNGLLHS KV KGVPETHVW VENEA — DKA 

CFSVTDFLKK AVFLKEALKC EQISNDGFIV CNTQSAHALE 
FVNLDNLRAN NT KGSLPINVIV FDGKSKCDES 



.... |...,| 

2805 
QIKNACVYFS 
QVKNASVYFS 
QVKNACVYFS 
NVRNACVYFS 
NFLNAAVFYA 
NFLNAAVFYA 
GFLNAAVFYA 
EAKNAAIYYA 
ASKSASVYYS 



....|....| 

2815 
QLLCEPIKLV 
QLLCRP IKLV 
QMLCKPVKLV 
QLIGKPIKIV 
QSLFRPILMV 
QSLFRPILMV 
QSLYRPMLLV 
QYLCKPILIL 
QLMCQPILLL 



.,..| 1 

2825 
NSELLSTLS- 
DSELLSTLS- 
DSALLASLS- 
NSDLLEDLS- 
DKNLITTANT 
DKILITTANT 
EKKLITTANT 
DQALYEQLVV 
DQVLVSDVGD 

I- 



I 1 

2835 
— VDFNGVLHK 
-VDFNGVLHK 
— VDFGAS LHS 
— VDFKGALFN 
GTSVTETMFD 
GTSVTETMFD 
GLSVSQTMFD 
-EPVSKSVID 
STEVSVKMFD 



| 1 

2845 
AYVDVLCNSF 
AYIDVLRNSF 
AFVSVLSNSF 
AKKNVIKNSF 
VYVDTFLSMF 
VYVDTFLSMF 
LYVDSLLGVL 
KVCSILSSII 
AYVDTFSATF 



| | 

2855 . 
FKELTANMSM 
GKDLNANMSL 
GKDLSSCNDM 
NVDVSECKNL 
DVDKKSLNAL 
DVDKKSLNAL 
DVDRKSLTSF 
SVDTAALNYK 
SVPMEKLKAL 



[ I 

2865 
AECKATLGLT 
AECKRALGLS 
QDCKSTLGFD 
DECYRACNLN 
IATAHSSIKQ 
IATAHSSIKQ 
VNAAHNSLKE 
AGTLRDALLS 
VAT AH S ELAK 



2875 



I 



GTQIYKVLDT 
GTQICKVLDT 
GVQLEQVMDT 

GVALDGVLST 



2885 



....|....| ....!.... I . ... I .... I ..| 
2895 2905 2915 2925 

VSDDDF VSAVANAHRY DVLLSDLSFN NFFISYAKPE 

ISDHEF TSAISNAHRC DVLLSDLSFN NFVSSYAKPE 

VPLDTF NAAVAEAHRY DVLLTDMSFN NFTTSYAKPE 

VSFSTF EMAVNNAHRF GILITDRSFN NFWPSKVKPG 

FLSCARKSCS IDSDVDTKCL ADSVMSAVSA GLELTDESCN NLVPTYLKSD 
FLSCARKSCS IDSDVDTKCL ADSVMSAVSA GLELTDESCN NLVPTYLKGD 
FIGCARRKCA IDSDVETKSI TKSIMSAVNA GVDFTDESCN NLVPTYVKSD 

ITKDEEA VDMAIFCHNH DVDYTGDGFT NVIPSYGIDT 

FVSAARQG-V VDTDVDTKDV IECLKLSHHS DLEVTGDSCN NFMLTYNKVE 



I | 

2935 
DK-LSVYDIA 
EK-LSAYDLA 
EK-FPVHDIA 
SSGVSAMDIG 
N — IVAADLG 
N — IVAADLG 
T— IVAADLG 
G-KLTPRDRG 
N— MTPRDLG 



....|....| 

2945 
CCMRAGSKVV 
CCMRAGAKW 
TCMRVG AKI V 
KCMTSDAKIV 
VLIQNSAKHV 
VLIQNSAKHV 
VLIQNNAKHV 
FLINADASIA 
ACIDCNARHI 



| 1 

2955 
NHNVLIKESI 
NANVLTKDQT 
NHNVLVKDSI 
NAKVLTQRGK 
QGNVAKIAGV 
QGNVAKIAGV 
QANVAKAANV 
NLRVKN--AP 
NAQVAKSHNV 



....|....| 

2965 
PIVWGVKDFN 
PIVWHAKDFN 
PWWLVRDFI 
SVVWLSQDFA 
SCIWSVDAFN 
SCIWSVDAFN 
ACIWSVDAFN 
PVVWKFSELI 
SLIWNVKDYM 



....|. ...I 

2975 
TLSQEGKKYL 
SLSAEGRKYI 
ALSEETRKYI 
ALSSTAQKVL 
QFSSDFQHKL 
QLSSDFQHKL 
QLSADLQHRL 
KLSDSCLKYL 
SLSEQLRKQI 



2985 
VKTTKAKGLT 
VKTSKAKGLT 
IRTTKVKGIT 
VKTFVEEGVN 
KKACCKTGLK 
KKACCtfTGLK 
RKACSKTGLK 
ISATVKSGVR 
RSAAKKNNIP 



....|.... I 

2995 
FLLTFNDNQA 
FLLTINENQA 
FMLTFNDCRM 
FSLTFNAVGS 
LKLTYNKQMA 
LELTYNKQMA 
IKLTYNKQEA 
FFITKSGAKQ 
FRLTCATTRQ 






EMCR 



....|....| ....!....| 

3005 3015 
ITQVP A TSIVAKQGAG 



I 



3025 



....|....| ....|....| ....|....| 

3035 3045 3055 
FKRTYNFLWY VCLFVVALFI GVSFID 






229E 

PEDV 

TGEV 

OC43 

BOCOV 

MHV 

AIPV 

SARS CoV 



EMCR 

229E

PEDV 

TGEV 

OC43 

BoCoV 

MHV 

AIPV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OC43 

BoCoV 

MHV 

AIPV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OC43

BoCoV

MHV 

AIPV 

SARS COV 



EMCR 

229E 

PEDV 

TGEV 

OC43

BoCoV 

MHV 

AIPV 

SARS COV 



EMCR 

229E 

PEDV 

TGEV 

OC43 

BoCoV 

MHV 

AIPV 

SARS COV 



EMCR 

229E 

PEDV 

TGEV 

OC43 

BoCoV 

MHV 

AIPV 

SARS CoV 



VTQIP A 

HTTIP — T 

DDDLPYERFT 

NVSVL T 

NVSVL T 

NVPIL T 

VIACHT — QK 
VVNVI T 

• • • • | • a • • | 

3065 
-YTTTVTSFH 
FMYDIVSSFE 
-FSTQVSSDS 
ATQSYIESAE 
— KSDFQLPV 
— KSDFQLPV 
— KSDMQLPL 
PMYDVNSTLH 
— IHDGYTNE 



TSIVAKQGAG D AGHSLTWLWL 

VCIANKKGAG LP S FSKVKKFFWF 

ESVSPKSGSG FFDVITQLKQ 

TPFSLKGGAV FS Y FVYVCFVLSL 

TPFSLKGGAV FS Y FVYVCFVLSL 

TPFSLKGGAV FS K VLQWLFWNL 

LLVEKKAGGI VSGTFKCFKS YFKWLLIFYI 
TKISLKGGKI VS T CFKLMLKATL 



•*..|.*..| 
3125 

VVG VSER 

WG VSEI 

WG VSDE 

WGTVFDLEN 
WA-VIDQDF 
WA-WDQDF 
VVA-VIDQDI 
VTA — VIDGD 
VAA-IITREI 



••••(••••I 

3075 
GYDFKYIENG 
GYDFKYIENG 
DYDFKYIESG 
GYDYMVIKNG 
YASYKVLDNG 
YASYKVLDNG 
YASFKVIDNG 
VEGFKVIDKG 
I IGYKAIQDG 

| | 

3135 
INWPGVPTN 
VNTVAGIPSN 
ART V PG I PAG 
MRPIPDVPAY 
GSTVFNVPTK 
GSTVFNVPTK 
GYTLFNVPTK 
GTVATGVPGF 
GFIVPGLPGT 



• ... I .... ) •••»!. ...j 

3085 3095 
QLKVFEAPLH CVRNVFDNFN 
QLKNFEAPLK CVRNVFENFE 
QLKTFDNPLS CVHNVFINFD 
IVQPFDDTIS CVHNTYKGFG 
VIRDVSVEDV CFANKFEQFD 
VIRDVSVEDV CFANKFEQFD 
VLRDVTVTDA CFANKFIQFD 
VLREIVPEDT CFSNKFVNFD 
VTRDIISTDD CFANKHAGFD 

. ... | .... | . . . . | . . „ . | 
3145 3155 

VYLVG KTLV 

VYLVG KTLI 

VYLAG KTLV 

VSIVG RSLV 

VLRYG YHVL 

VLRYG YHVL 

VLRYG FHVL 

VSWVMDGVMF IHMTQTERKP 
VLRAIN GDFL 



I | | | 

3185 3195 

TS DK C I FN S ACT RL 

TP EK CIFTSACTRL 

DK GA CIFNSACTTL 

VSKD-SYFDT CVFNTACTTL 
ISYSNFYASG CVLSSACTMF 
ISYSNFYASG CVLSSACTMF 
IPYDNFYASG CVLSSLCTML 
TEG-SFYTSI ALFSARCLYL 
IEYSDFATSA CVLAAECTIF 



3245 
YVRFPEILAR 
FIKLPEVIAQ 
AVSLPEIISR 
MVKLPAIIR- 
FIRFPEVLRE 
FIRLPEVLRE 
YIRFPEWSE 
LIVPQQILHT 
I I QFPNTYLE 

3305 
GDGLIDLLVN 
GTGLWNLVFN 
GTGLFTLLMN 
GNSVLGFFKN 
GRDVFDLIYQ 
GRDVFDLIYQ 
GRNAFDLIHQ 
-GSTVRELMFS- 
GVDAMNLIAN 

• . . » | . w • • | 

3365 
TWCATLINN 
TVWAVLLNN 
TVGACTLLNN 
MIIVTLWNN 
VNVIVWCVNF 
VNVIVWCVNF 
INVIVWCINF 
ITMLVWVINA 
ANALLFLMSF 



* 3255* * ' 
GFGLRTIRTL 
GFGFRTVRTI 
GFGIRTIRTK 
GLGLRFVKTQ 
GL-VRIVRTR 
GL-VRIVRTR 
GI-VRIVRTR 

PY WKFV 

GS-VRWTTF 

* > • t | • » • » | 

3315 
VLSIFSSSFS 
ILSMFSSSFS 
VISVFSKTVP 
VFKLFNSNMS 
LFKGLAQPVD 
LFKGLAQPVD 
VLGGLVRPID 
MVSTFFTGVN 
IFTPLVQPVG 

•••■{•••■[ 

3375 
ISYWTQN-L 
VSYIVTQN-L 
VSYIVTQN-T 
VSYFVTQN-T 
MMLFVFQVYP 
MMLFVFQVYP 
LMLFVFQVYP 
FILCVHSYNS 
TILCLVPAYS 



• . | 

3205 
EGLGGD-NVY 
EGLGGN-NVY 
SGLGGT-AVY 
TGLGGT-LVY 
TMADGSPQPY
AMADGSPQPY 
AHADGTPHPY 
TASNTP-QLY 
KDAMGKP VP Y 

». | 

3265 
ATRYCRVGEC 
ATKYCRVGEC 
AMTYCRVGQC 
ATTYCRVGEC 
SMSYCRVGLC 
SMSYCRVGLC 
SMTYCRVGLC 
SDSYCRGSVC 
DAEYCRHGTC 

-...| ....| 

3325 
VVAMSGHMLF 
VAAMSGQILL 
VTVLSGQILF 
VVATSGAMLV 
FLALTASSIA 
FLALTASSIA 
FFALTASSVA 
-PNIYMQLAT- 
ALDVSASVVA 

I I 

3385 
FFMLLYAILY 
VTMIAYAILY 
LGMLGYATLY 
FFMIIYAIVY 
ILSCVYAICY 
TLSCVYAICY 
TLSCLYACFY 
VLAVILLVLY 
FLPGVYSVFY 



I | 

3215 
CYN-TDLIEG 
CYN— TALMEG 
CYK-NGLVEG 
CAK-QGLVEG 
CYT-EGLMQN 
CYT-DGLMQN 
CYT-EGIMHN 
C FNG DNDAEG 
CYD-TNLLEG 

•.••[... » | 

3275 
RDSHKGVCFG 
VESNAGVCFG 
VQSAEGVCFG 
IDSKAGFCFG 
EEADEGICFN 
EEADEGICFN 
EDAEEGVCFN 
EYTRPGYCVS 
ERSEVGICLS 



LCGLVCLIQF 
LCLFIVAAFF 
IVILVFVFIF 
VCFIGLWCLM 
VCFIGLWCLM 
ICFIVLWALM 
LFTACCSGYY 
LCVLAALVCY 

I | 

3105 
QWHEAKFGVV 
DWHYAKFGFT 
QWHDAKFGFT 
DWFKAKYGFI 
QWYESTFGLS 
QWYESTFGLS 
QWYESTFGLV 

AFWG RP 

AWFSQRGGSY 

•••«!••»• I 

3165 
FTLQAAFGNT 
FTLQAAFGNA 
FAINTIFGTS 
FAINAAFGVT 
HFITHALSAD 
HFITHALSAD 
HFITHAFATD 
WYIPTWFNRE 
HFLPRVFSAV 

•»••!■•»•! 

3225 
SKPYSILQPN 
SLPYSSIQAN 
AKLYSELAPH 
AKLYSDLMPD 
ASLYSSLVPH 
ASLYSSLVPH 
ASLYDSLAPH 
ALPFGSIIPH 
SISYSELRPD 

• •-••{••••I 

3285 
FDKWYVNDGR 
FDKWFVNDGR 
ADRFFVYNAE 
GDNWFVYDNE 
FNGSWVLNND 
FNGSWVLNND 
FNSSWVLNNP 
LNPQWVLFND 
TS GRWVLNNE 



YLCFFMPY — 

ALSFLD 

ICGLCSVYSV 

PTYTVH 

PTYTVH 

PTYAVH 

YMEVSKSFVH 
IVMPVHTLS- 

I | 

3115 
TTNSD-KCPI 
PLNKQ-SCPI 
PVNNP-SCPI 
PTFGK-SCPI 
YYSNSMACPI 
YYSNSMACPI 
YYRNSRACPV 
YDNSR-NCPI 
KNDKS — CPV 



••••)», 

3175 
GVCYDFDGVT 
GVCYDIFGVT 
GLCFDASGVA 
NMCYDHTGNA 
GVQCYTPHSQ 
GVQCYTPHSQ 
SVQCYTPHMQ 
IVGYTQDSII 
GNICYTPSKL 

3235 
AYYKYDVKN- 
AYYKYDNGN- 
SYYKMVDGN- 
YYYEHASGN- 
VRYNLANAKG 
VRYNLANAKG 
VRYNLANSNG 
RVYFQPNGVR 
TRYVLMDGS- 



I | 

3295 

VD DGYIC 

VA NGYVC 

SG SDFVC 

FG NGYIC 

YYRSLPGTFC 
YYRSLPGTFC 
YYRAMPGTFC 
EYTSKPGVFC 
HYRALSGVFC 



...... ...| 

3335 
NFLFAAFITF 
NCALGAFAIF 
NCIIAFVAVA 
NIIIACLAIA 
GAILAVIWL 
GAILAVIVVL 
GAILAIIVVL 
MFLILVWVL 
GGIIAILVTC 

I | 

3395 
FVFTRTVR — 
FFATRSLR — 
FLCTKGVR — 
YFITRKLA — 
FYATLYFPSE 
FYATLYFPSE 
FYTTLYFPSE 
CYASLVTSRN 
LYLTFYFTND 



....(»••, | 

3345 
LCFLVTKFKR 
CCFLVTKFRR 
VCFLFTKFKR 
MCYGVLKFKK 
VFYYLIKLKR 
GFYYLIKLKR 
AFYYLIKLKR 
IFAMVIKFQG 
AAYYFMKFRR 

3405 " ' 
YAWIWHIAYI 
YAWIWCAAYL 
YMWIWHLGFL 
YPGILDAGFI 
ISVIMHLQWL 
ISVIMHLQWL 
ISVVMHLQWL 
TVIIMHCWLV 
VSFLAHLQWF 



••••[..••| 

3355 
VFGDLS YGVF 
MFGDLSVGVC 
MFGDMSVGVF 
IFGDCTFLIV 
AFGDYTSWF 
AFGDYTSIVF 
AFGDYTSVW 
VFKAYATTVF 
VFGEYNHWA 

I I 

3415 
VAYFLLIPWW 
IAYISFAPWW 
ISYILIAPWW 
IAYINMAPWY 
VMYGTIMPLW 
VMYGTIMPLW 
VMYGAIMPLW 
FTFGLIVPTW 
AMFSPIVPFW 



....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

3425 3435 3445 3455 3465 3475 

EMCR LLTWFSFAAF LELLPNVFKL K ISTQL FEGDKFIGTF ESAAAGTFVL DMRSYERLIN

229E LCAWYFLAML TGLLPSLLKL K VSTNL FEGDKFVGTF ESAAAGTFVI DMRSYEKLAN

PEDV VLMVYAFSAI FEFMPNLFKL K VSTQL FEGDKFVGSF ENAAAGTFVL DMHAYERLAN

TGEV VITAYILVFL YDSLPSLFKL K VSTNL FEGDKFVGNF ESAAMGTFVI DMRSYETIVN

OC43 FCLLYIAWV SN — HAFWVF S YCRKL GTSVRSDGTF EEMALTTFMI TKDSYCKLKN

BoCoV FCLLYISWV SN — HAFWVF S YCRQL GTSVRSDGTF EEMALTTFMI TKDSYCKLKN

MHV FCIIYVAWV SN — HALWLF S YCRKL GTEVRSDGTF EEMSLTTFMI TKESYCKLKN

AIPV LACCYLGFII YMYTPLFLWC YGTTKNTRKL YDGNEFVGNY DLAAKSTFVI RGSEFVKLTN






SARS CoV ITAIYVFCIS LKHCHWFFNN Y— — LEKRV MFNGVTFSTF EEAALCTFLL NKEMYLKLRS 

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

3485 3495 3505 3515 3525 3535 

EMCR T — ISPEKLK NYAASYNKYK YYSGSASEAD YRCACYAHLA KAMLDYAKDH N-DMLYSPPT

229E g — ISPEKLK SYAASYNRYK YYSGNANEAD YRCACYAYLA KAMLDFSRDH N-DILYTPPT

PEDV S — ISTEKLR QYASTYNKYK YYSGSASEAD YRLACFAHLA KAMMDYASNH N-DTLYTPPT

TGEV g — TSXARIK SYANSFNKYK YYTGSMGEAD YRMACYAHLG KALMDYSVNR T-DMLYTPPT

OC43 s — LSDVAFN RYLSLYNKYR YYSGKMDTAA YREAACSQLA KAMDTFTNNN GSDVLYQPPT

BoCoV S — LSDVAFN RYLSLYNKYR YYSGKMDTAA YREAACSQLA KAMDTFTNNN GSDVLYQPPT

MHV S—VSDVAFN RYLSLYNKYR YFSGKMDTAA YREAACSQLA KAMET FNHNN GNDVLYQPPT 

AIPV E— I-GDKFE AYLSAYARLK YYSGTGSEQD YLQACRAWLA YALDQYR-N S GVEIVYTPPR 

SARS COV ETLLPLTQYN RYLALYNKYK YFSGALDTTS YREAACCHLA KALNDFS-NS GADVLYQPPQ 

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

3545 3555 3565 3575 3585 3595 

EMCR ISYN-STLQS GLKKMAQPSG CVERCWRVC YGSTVLNGVW LGDTVTCPRH VIAPS-TTVL 

229E VSYG-STLQA GLRKMAQPSG FVEKCWRVC YGNTVLNGLW LGDIVYCPRH VIASN-TTSA

PEDV VSYN-STLQA GLRKMAQPSG WEKCIVRVC YGNMALNGLW LGDIVMCPRH VIASS-TTST

TGEV VSVN-STLQS GLRKMAQPSG LVEPCIVRVS YGNNVLNGLW LGDEVICPRH VIASD-TTRV 

OC43 ASVSTSFLQS GIVKMVNPTS KVEPCVVSVT YGNMTLNGLW LDDKVYCPRH VICSASDMTN 

BoCoV ASVSTSFLQS GIVKMVNPTS KVEPCIVSVT YGNMTLNGLW LDDKVYCPRH VICSASDMTN

MHV ASVTTSFLQS GIVKMVFPTS KVEPCVVSVT YGNMTLNGLW LDDKVYCPRH VICSSADMTD 

AIPV YSIGVSRLQS GFKKLVSPSS AVEKCIVSVS YRGNNLNGLW LGDTIYCPRH VLG KFSG

SARS CoV TSITSAVLQS GFRKMAFPSG KVEGCMVQVT CGTTTLNGLW LDDTVYCPRH VICTAEDMLN 

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

3605 3615 3625 3635 3645 3655 

EMCR IDYDHAYSTM RLHNFSVSHN G-VFLGVVGV TMHGSVLRIK VSQSNVHTPK HVFKTLKPGA

229E IDYDHEYSIM RLHNFSIISG T— AFLGWGA TMHGVTLKIK VSQTNMHTPR HSFRTLKSGE 

PEDV IDYDYALSVL RLHNFSISSG N-VFLGVVSA TMRGALLQIK VNQNNVHTPK YTYRTVRPGE 

TGEV INYENEMSSV RLHNFSVSKN N-VFLGWSA RYKGVNLVLK VNQVNPNTPE HKFKS I KAGE 

OC43 PDYTNLLCRV TSSDFTVLFD R-LSLTVMSY QMRGCMLVLT VTLQNSRTPK YTFGVVKPGE 

BoCoV PDYTNLLCRV TSSDFTVLFD R-LSLTVMSY QMQGCMLVLT VTLQNSRTPK YTFGVVKPGE

MHV PDYSNLLCRV ISSDFCVMSG R-MSLTVMSY QMQGSLLVLT VTLQNPNTPK YSFGWKPGE

AIPV DQWNDVLNLA NNHEFEVTTQ HGVTLNVVSR RLKGAVLILQ TAVANAETPK YKFIKANCGD 

SARS COV PNYEDLLIRK SNHSFLVQAG N-VQLRVIGH SMQNCLLRLK VDTSNPKTPK YKFVRIQPGQ 

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

3665 3675 3685 3695 3705 3715

EMCR SFNILACYEG IASGVFGVNL RTNFTIKGSF INGACGSPGY NVRNDGTVEF CYLHQIELGS 

229E GFNILACYDG CAQGVFGVNM RTNWTIRGSF INGACGSPGY NLKN-GEVEF VYMHQIELGS 

PEDV SFNILACYDG AAAGVYGVNM RSNYTIRGSF INGACGSPGY NINN-GTVEF CYLHQLELGS 

TGEV SFNILACYEG CPGSVYGVNM RSQGTIKGSF IAGTCGSVGY VLEN-GILYF VYMHHLELGN

OC43 TFTVLAAYNG KPQGAFHVTM RSSYTIKGSF LCGSCGSVGY VIMG-DCVKF VYMHQLELST 

BoCoV TFTVLAAYNG KPQGAFHVTM RSSYTIKGSF LCGSCGSVGY VIMG-DCVKF VYMHQLELST 

MHV TFTVLAAYNG KSQGAFHVTM RSSYTIKGSF LCGSCGSVGY VLTG-DSVRF VYMHQLELST 

AIPV SFTIACAYGG TWGLYPVTM RSNGTIRASF LAGACGSVGF NIEK-GVVNF FYMHHLELPN 

SARS CoV TFSVLACYNG SPSGVYQCAM RPNHTIKGSF LNGSCGSVGF NIDY-DCVSF CYMHHMELPT

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

3725 3735 3745 3755 3765 3775 

EMCR GAHVGSDFTG SVYGNFDDQP SLQVESANLM LSDNVVAFLY AALLNGCR WWLRST 

229E GSHVGSSFDG VMYGGFEDQP NLQVESANQM LTVNVVAFLY AAILNGCT WWLKGE

PEDV GCHVGSDLDG VMYGGYEDQP TLQVEGASSL FTENVLAFLY AALINGST WWLSSS 

TGEV GSHVGSNFEG EMYGGYEDQP SMQLEGTNVM SSDNVVAFLY AALINGER WFVTNT 

OC43 GCHTGTDFNG DFYGPYKDAQ VVQLLIQDYI QSVNFVAWLY AAILNNCN WFVQSD

BOCOV GCHTGTDFNG DFYGPYKDAQ VVQLPVQDYI QSVNFVAWLY AAILNNCN WFVQSD 

MHV GCHTGTDFSG NFYGPYRDAQ WQLPVQDYT QTVNWAWLY AAILNRCN WFVQSD

AIPV ALHTGTDLMG EFYGGYVDEE VAQRVPPDNL VTNNIVAWLY AAIISVKESS FSLPKWLEST 

SARS COV GVHAGTDLEG KFYGPFVDRQ TAQAAGTDTT ITLNVLAWLY AAVINGDR WFLNRF 

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

3785 3795 3805 3815 3825 3835

EMCR RVNVDGFNEW AMANGYTIVS SV~ ECYSIL AAKTGVSVEQ LLASIQHLHE -GFGGKNILG 

229E KLFVEHYNEW AQANGFTAMN GE—DAFSIL AAKTGVCVER LLHAIQVLNN -GFGGKQILG 

PEDV RIAVDRFNEW AVHNGMTTVG NT — DCFSIL AAKTGVDVQR LLASIQSLHK -NFGGKQILG 

TGEV SMSLESYNTW AKTNSFTELS ST--DAFSML AAKTGQSVEK LLDSIVRLNK -GFGGRTILS 

OC43 KCSVEDFNVW ALSNGFSQVK SD — LVIDAL ASMTGVSLET LLAAIKRLKN -GFQGRQIMG
PEDV

TGEV 
BoCoV
EMCR

229E 

PEDV 

TGEV 

OC43 

BoCoV 

MHV 

AIPV 

SARS CoV 



EMCR 
229E 
PEDV 
TGEV 
OC43
BoCoV
MHV 
AIPV 
SARS COV 



EMCR 

229E 

PEDV 

TGEV 

OC43 

BoCoV 

MHV 

AIPV 

SARS CoV 



EMCR 



KCSVEDFNVW ALSNGFSQVK SD-- LVIDAL ASMTGVSLET LLAAIKRLKN -GFQGRQIMG 
SCSLEEFNVW AMTNGPSSIK AD — LVLDAL ASMTGVTVEQ ILAAIKRLYS -GFQGKQILG 
TVSVDDYNKW AGDNGFTPFS TS-- TAITKL SAITGVDVCK LLRTIMVKNS -QWGGDPILG 
TTTLNDFNLV AMKYNYEPLT QDHVDILGPL SAQTGIAVLD MCAALKELLQ NGMNGRTILG 



■•••!•■••! 

3845 
YSSLCDEFTL 
YSSLNDEFSI 
HTSLTDEFTT 
YGSLCDEFTP 
SCSFEDELTP 
SCSFEDELTP 
SCVLEDELTP 
QYNFEDELTP 
STILE DEFTP 



| | 

3905 
NPVILTPXFC 
NPGFLTPFMI 
NPGYVTPMFA 
NPTFVSIVLA 
TTNMFSITFC 
TTNMLSITFC 
TTHMLGVTLC 
PLKFYVYAAV 
YENAFLPFTL 



3855 
AEVVKQMYGV 
NEVVKQMFGV 
GEVVRQMYGV 
TEVIRQMYGV 
SDVYQQLAGI 
SDVYQQLAGI 
SDVYQQLAGV 
ESVFNQIGGV 
FDVVRQCSGV 



• • •• I • . « • I 
3865 

NLQSGK V 

NLQSGK T 

NLQGGY V 

NLQAGK V 

KLQSKRTRLF 
KLQSKRTRLV 
KLQSKRTRW 
RLQSSFVR — 
TFQGKFKKIV 



....!....! 

3915 
LLLFLSLVLT 
LLVALSLCLT 
CLSLLSSLLM 
VTTLISTVFV 
ALCVIS-LAM 
ALCVIS-LAM 
ALCFVS-FAM 
ILLMAVLFIS 
GIMAIAACAM 



• • • • I I 

3925 
MFLKHKFLFL 
FWKHKVLFL 
FTLKHKTLFF 
SGIKHKMLFF 
LLVKHKHLYL 
LLVKHKHLYL 
LLVKHKHLYL 
FTVKHVMAYM 
LLVKHKHAFL 



••••{••••| 

3875 
IFGLKTMFLF 
TSMFKSISLF 
SRACRNVLLV 
KSFFYPIMTA 
KGTVCWIMAS 
KGIVCWIMAS 
KGTCCWILAS 
K — ATSWFWS 
KGTHHWMLLT 

I 



••••(••••| 

3885 
SVFFTMFWAE 
AGFFVMFWAE 
GSFLTFFWSE 
MTILFAFWLE 
TFLFSCIITA 
TFLFSCIITA 
TLLFCSIISA 
RCVLACFLFV 
FLTSLLILVQ 



■•••(••••| 

3895 
LFIYTNTIWI 
LFVYTTTIWV 
LVSYTKFFWV 
FFMYTPFTWI 
FVKWTMFMYV 
FVKWTMFMYV 
FVKWTMFMYV 
LCAIVLFTAV 
STQWSLFFFV 



3935 
QVFLLPTVIA 
QVFLLPSIIV 
QVFLIPALIV 
MSFVLPSVIL 
TMYITP-VLF 
TMYIIP-VLF 
TMFIMP-VLC 
DTFLLPTLIT 
CLFLLPSLAT 



3945 
TALYNC-VLD 
AAIQNC-AWD 
TSCINL-AFD 
VTAHNL-FWD 
TLLYNN-YLV 
TLLYNN-YLV 
TLFYTN-YLV 
VIIGVCAEVP 
VAYFN MV 



I | 

3955 
YYIVKFLADH 
YHVTKVLAEK 
VEVYNYLAEH 
FSYYESLQSI 
VYKHTFRGYV 
VYKQTFRGYV 
VYKQSFRGLA 
FIYNTLISQV 
YMPASWVMRI 



I | 

3965 
FN-YNVSVLQ 
FD-YNVSVMQ 
FD-YHVSLMG 
VENTNTMFLP 
YAWLS YYVPS 
YAWLSYYVPS 
YAWLSHFVPA 
VIFLSQWYDP 
MTWLELADTS 



| | 

3975 
MDVQGLVNVL 
MDIQGFVNIF 
FNAQGLVNIF 
VDMQGVMLTV 
VEYTYTDEVI 
VEYTYTDEVT 
VDYTYMDEVL 
WFDTMVPWM 
LSGYRLKDCV 



....|....| 

3985 
VCLFWFLH— 
I CLFVALLH - 
VCFWTILHG 
FCFIVFVTYS 
YGMLLLVGMV 
YGMLLLIGMV 
YGVVLLVAMV 
FLPLVLYTAF 
MYASALVLLI 



3995 
— TWRFSKER 
— TWRFAKER 
TYTWRFFN-T 
VRFFTCKQSW 
FVTLRSINHD 
FVTLRSINHD 
FVTMRSINHD 
KCVQGCYMNS 
LMTARTVYDD 



4005 
FTHWFTYVCS 
CTHWCTYLFS 
PASSVTYWA 
FSLAVTTILV 
LFSFIMFVGR 
LFSFIMFVGR 
VFSVMFLVGR 
FNTSLLMLYQ 
AARRVWTLMN 



I 



4015 
LIAVAYTYFY 
LIAVLYTALY 
LLTAAYNYFY 
IFNMVKI FGT 
LISVFSLWYK 
VISWSLWYM 
LVSLVSMWYF 
FVKLGFVIYT 
VITLVYKVYY 



• • • • I . . • . I ....|.... | 
4025 4035 

SGD --FLSL 

SYD YVSL 

ASD ILSC 

SDEPWTENQI AFCFVNM 

GSN LEEEI 

GSN LEEEI 

GAN LEEEV 

SSNTLTAYTE GNWELFFELV 
GNALD QAISM 



•...|....| 

4085 
GDVKLTLWY 
DGVKTVLLFY 
GDIKSVMFCY 
GFMKCISIVY 
PQIKIVLLCY 
PQIKIVLVCY 
PQVKLVLLSY 
NNYVLMAVMV 
NTLQCIMLVY 

I | 

- 4145 
PFDALWLSFK 
PFDALFLSFK 
TLDSLLLSAK 
AYDAMILSAK 
SFEALMLNFK 
S FEALMLNFK 
SFEALVLNFK 
WEVFSTNIL 
SIDAFKLNIK 



..-.|. ...| 

4095 
LICGYLVCTY 
MLLGFVSCMY 
LVLGYFTCCF 
MACGYLFCCY 
LFIGYIISCY 
LFIGYIISCY 
LCIGYVCCCY 
NCIGWLCTCY 
CFLGYCCCCY 

• ••'■(••••| 

-41-55 - - 
LLGIGGDRCI 
LMGIGGPRTI 
LIGIGGERNI 
LIGVGGKRNI 
LLGIGGVPII 
LLGIGGVPII 
LLGIGGVPVI 
IQGIGGDRVL 
LLGIGGKPCI 



4045 
LVMFLCAISS 
LVMLLCAISN 
AMTLFASVTG 
LTMIVSLTTK 
LLMLASLFGT 
LLMLASLFGT 
LLFLTS LFGT 
HTTVLANVSS 
WALVISVTSN 

• • ♦ • I • - • * I 



4105 
WGILYWFNRF 
YGLLYWINRF 
YGILYWFNRF 
YGILYWVNRF 
WGLFSLMNSL 
WGLFSLMNSL 
WGVLSLLNSI 
FGLYWWVNKV 
FGLFCLLNRY 

' 4165 - 1 
KISTVQSKLT 
KVSTVQSKLT 
KISSVQSKLT 
KISTVQSKLT 
EVSQFQSKLT 
EVSQFQSKLT 
EVSQIQSRLT 
PIATVQAKLS 
KVATVQSKMS 



I | 

4055 . 
DWYIGAIVFR 
EWYIGAIIFR 
NWFVGAVCYK 
DWMWIASYR 
YTWTTVLSMA 
YTWTTALSMA 
YTWTTMLSLA 
NSLIGLFVFK 
YSGVVTTIMF 

I. 



4065 
LSRLIIFFSP 
ICRFGVAFLP 
VAVYMALRFP 
IAYYIWCVM 
VAKVIAKWVA 
AAKVIAKWVA 
TAKVIAKWLA 
CAKWMLYYCN 
LARAIVFVCV 



I 



4075 

E SVFSVF 

V EYVSYF 

TFVAIF 

P-S-AFVSDF 
VNV-LYFTDI 
VNV-LYFTDI 
VNV-LYFTDV 

ATYL 

EYYPLLFITG 



4115 
FKCTMGV YD F 
CKCTLGVYDF 
FKVSVGVYDY 
TCMTCGVYQF 
FRMPLGVYNY 
FRMPLGVYNY 
FRMPLGVYNY 
FGLTLGKYNF 
FRLTLGVYDY 

•••«|«*«*| 

• " -4175" " 

DLKCTNVVLL 

DLKCTNVVLM 

DIKCSNWLL 

EMKCTNWLL 

DVKCANWLL 

DVKCANGGLL 

DVKCVNWLL 

DVKCTTVVLM 

DVKCTSWLL 



4125 
KVSAAEFKYM 
CVSPAEFKYM 
TVSAAEFKYM 
TVSAAELKYM 
KISVQELRYM 
KISVQELRYM 
KISVQELRYM 
KVSVDQYRYM 
LVSTQEFRYM 

• ] 

'4185 " — 
GCLSSMNIAA 
GILSNMNIAS 
GCLSSMNVSA 
GLLSKMHVES 
NCLQHLHVAS 
NCLQHLHVAS 
NCLQHLHIAS 
QLLTKLNVEA 
SVLQQLRVES 



I 



4135 
VANGLHAPYG 
VANGLNAPNG 
VANGLRAPTG 
TANNLSAPKN 
NANGLRPPKN 
NANGLRPPKN 
NANGLRPPRN 
CLHKINPPKT 
NSQGLLPPKS 

•■••!•.••] 

" 4195 

NSSEWAYCVD 

NSKEWAYCVE 

NSTEWAYCVD 

NSKEWNYCVG 

NSKLWHYCST 

NSKLWQYCST 

SSKLWQYCST 

NSKMHVYLVE 

SSKLWAQCVQ 



• • I I 

4205 



..I.... I ....|. ...| . — I .... I ....|.... | 

4215 4225 4235 4245 4255 

LHNKINLCDD PEKAQGMLLA LLAFFLSKHS DFG L DGLIDSYFDN SSTLQSVASS 



229E 
PEDV
TGEV 
OC43 
BoCoV
MHV 
AIPV 
SARS CoV 


EMCR 

229E 

PEDV 
TGEV

OC43 

BoCoV 

MHV 

AIPV 
SARS CoV
EMCR

229E 

PEDV 

TGEV 

OC43 

BoCoV 

MHV 

AIPV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OC43 

BoCoV 

MHV 

AIPV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OC43 

BOCOV 

MHV 

AIPV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OC43 

BoCoV 

MHV 

AIPV 

SARS CoV 



MHNKINLCDD 
LHNKINLCND 
LHNEINIiCDD 
LHNEILATSD 
LHNEILATSD 
LHNEILATSD 
LHNKILASDD 
LHNDILLAKD 



PETAQELLLA 
PEKAQEMLLA 
PEIVLEKLLA 
LSVAFEKLAQ 
LGVAFEKLAQ 
LSVAFDKLAQ 
VGECMDNLLG 
TTEAFEKMVS 



LLAFFLSKHS 
LLAFFLSKNS 
LIAFFLSKHN 
LLIVLFANPA 
LLIVLFANPA 
LLWLFANPA 
MLITLFCIDS 
LLSVLLSMQG 



4265 
FVSMPSYIAY 
FVGMPSFVAY 
YVGLPSYVIY 
YAALPSWIAL 
FVNMASFVEY 
FVNMASFVEY 
FVNMASFVEY 
FSHIPSYAEY 
FSSLPSYAAY 

....| | 



1 \ 

4275 
ENARQAYEDA 
ETARQEYENA 
ENARQQYEDA 
EKARADLEEA 
EVAKKNLDEA 
EVAKKNLDEA 
ELAKKNLDEA 
ERAKNLYEKV 
ATAQEAYEQA 



....!....! 
4285 

IANGSS 

VANGSS 

VNNGSP 

KKNDVS 

RFSGSAN 

CSSGSAN 

KASGSAN 

LVDSKNGGVT 
VANGDS 



DFG L 

AFG L 

TCD L 

AVDSKCLTSI 
AVDSKCLTSI 
AVDSKCLASI 

TID L 

AVD 1 

> 1 



GDLVDSYFEN 
DDLLESYFND 
SELIESYFEN 
EEVCDDYAKD 
EEVCDDYAKD 
EEVSDDYVRD 
SEYCDDILKR 
NRLCEEMLDN 



DSILQSVASS 
NSMLQSVAST 
TTILQSVASA 
NTVLQALQSE 
NTVLQALQSE 
STVLQALQSE 
STVLQSVTQE 
RATLQAIASE 



4295 
SQLIKQLKRA 
PQIIKQLKKA 
PQLVKQLRHA 
PQILKQLTKA 
QQQLKQLEKA 
QQQLKQLEKA 
QQQIKQLEKA 
QQELAAYRKA 
EWLKKLKKS 



I I 

4305 
MNIAKSEFDH 
MNVAKAEFDR 
MNVAKSEFDR 
FNIAKSDF5R 
CNIAKSAYER 
CNIAKSAYER 
CNIAKSAYER 
ANIAKSVFDR 
LNVAKSEFDR 



I I 

4315 
EISVQKKINR 
ESSVQKKINR 
EASTQRKLDR 
EASVQKKLDK 
DRAVAKKLER 
DRAVARKLER 
DRAVARKLER 
DLAVQKKLDS 
DAAMQRKLEK 



4325 
MAEQAATQMY 
MAEQAAAAMY 
MAEQAAAQMY 
MAEQAAASMY 
MADLALTNMY 
MADLALTNMY 
MADLALTNMY 
MAERAMTTMY 
MAD QAMTQMY 



I I 

4335 
KEARSVNRKS 
KEARAVNRKS 
KEARAVNRKS 
KEARAVDRKS 
KEARINDKKS 
KEARINDKKS 
KEARINDKKS 
KEARVTDRRA 
KQARSEDKRA 



. . - . I I 

4345 
KVISAMHSLL 
KVVSAMHSLL 
KWSAMHSLL 
KIVSAMHSLL 
KWSALQTML 
KWSALQTML 
KVVSALQTML 
KL VS S LH ALL 
KVTSAMQTML 



I I 

4355 
FGMLRRLDMS 
FGMLRRLDMS 
FGMLRRLDMS 
FGMLKKL DMS 
FSMVRKLDNQ 
FSMVRKLDNQ 
FSMIRKLDNQ 
FSMLKKIDSE 
FTMLRKLDND 



I 



|.. 

4365 
SVETVLNLAR 
SVDTILNMAR 
SVDTILNLAK 
SVNTIIDQAR 
ALNSILDNAV 
ALNSILDNAV 
ALNSILDNAV 
KLNVLFDQAS 
ALNNIINNAR 



I I 

4375 
DGWPLSVIP 
NGVVPLSVIP 
DGWPLSVIP 
NGVLPLSIIP 
KGCVPLNAIP 
KGCVPLNAIP 
KGCVPLNAIP 
SGWPLATVP 
DGCVPLNIIP 



4385 
ATSASKLTIV 
AT S AARLVW 
AVSATKLNIV 
AASATRLVVI 
SLAANTLNII 
SLAANTLTII 
SLTSNTLTII 
IVCSNKLTLV 
LTTAAKLMVV 

1 



I I • I • I I I . ... I - ... I I • I 

4395 4405 4415 4425 4435 

SPDLESYSKI VCDGSVHYAG WWTLNDVKD NDGRPVHVKE ITR EN 

VPDHDSFVKM MVDGFVHYAG VWTLQEVKD NDGKNVHLKD VTK EN 

TSDIDSYNRI QREGCVHYAG TI.WNIIDIKD NDGKWHVKE VTA QN 

TPSLEVFSKI RQENNVHYAG AIWTIVEVKD ANGS HVHLKE VTA AN 

VPDKSVYDQV VDNVYVTYAG NVWQIQTIQD SDGTNKQLNE IS 

VPDKSVYDQV VDNVYVTYAG NVWQIQTIQD SDGTNKQLHE IS — 

VPDKQVFDQV VDNVYVTYAG NVWHIQSIQD ADGAVKQLNE ID 

IPDPETWVKC VEGVHVTYST WWNIDTVID ADGTELHPTS TGSGLTYCIS 
VPDYGTYKNT CDGNTFTYAS ALWEIOQVVD ADSKIVQLSE INM-^ DN 



4445 
VETLTWPLIL 
QEILVWPLIL 
AESLSWPLVL 
ELNLTWPLSI 
-DDCNWPLVI 
-DDCNWPLVI 
-VNITWPLVI 
GAN I AWPLKV 
SPNLAWPLIV 



I I 

4455 

NCER 

TCER 

GCER 

TCER 

IANRY-NEVS 
IANRH-NEVS 
AANRH-NEVS 
NLTRNGHNKV 
TALRA-N — S 



| | 

4465 
WKLQNNEIM 
VVKLQNNEIM 
IVKLQNNEII 
TTKLQNNEIM 
ATVLQNNELM 
ATVLQNNELM 
SWLQNNELM 
DWLQNNELM 
AVKLQNNELS 



....|.. ..I 

4475 
PGKLKQKPMK 
PGKMKVKATK 
PGKLKQRSIK 
PGKLKERAVR 
PAKLKIQWN 
PAKLKTQWN 
PQKLRTQWN 
PHGVKTKACV 
PVALRQMSCA 



I I 

4485 
AEG — DGGVL 
GEG--DGGIT 
AEG — DG-IV 
ASATLDGEAF 
SGP — DQTCN 
SGP — DQTCN 
SGS — DMNCN 
AGVD-QAHCS 
AGTTQTACTD 



I I 

4495 
GDGNALYNTE 
SEGNALYNNE 
GEGKALYNNE 
GSGKALMASE 
TPTQCYYNNS 
TPTQCYYNNS 
TPTQCYYNTT 
VESKCYYTNI 
DNALAYYNNS 



....| | 

4505 
GGKTFMYAYI 
GGRAFMYAYV 
GGRTFMYAFI 
SGKSFMYAFI 
NNGKIVYAIL 
YNGKIVYAIL 
GMGKIVYAIL 
SGNSWAAIT 
KGGRFVLALL 

I 



I I 

4515 
SNKADLKFVK 
TTKPGMKYVK 
SDKPDLRWK 
ASDNNLKYVK 
SDVDGLKYTK 
SDVDGLKYTK 
SDCDGLKYTK 
SSNPNLKVAS 
SDHQDLKWAR 



1 |- 

4525 
WEY-EGG-CN 
WEH-DSG-W 
WEF-DGG-CN 
WES-NND-II 
ILKDDGN-FV
ILKDDGN-FV 
IVKEDGN-CV 
FLNEAGN-QI 
FPKSDGTGTI 



.•..1 I 

4535 
TIELDSPCRF 
TVELEPPCRF 
TIELEPPRKF 
PIELEAPLRF 
VLELDPPCKF 
VLELDPPCKF 
VLELDPPCKF 
YVDLDPPCKF 
YTELEPPCRF 



| | 

4545 
MVETPNGPQV 
VIDTPTGPQI 
LVDSPNGAQI 
YVDGANGPEV 
TVQDAKGLKI 
TVQDVKGLKI 
SVQDVKGLKI 
GMKVGVKVEV 
VTDTPKGPKV 



I I 

4555 
KYLYFVKNLN 
KYLYFVKNLN 
KYLYFVRNLN 
KYLYFVKNLN 
KYLYFVKGCN 
KYLYFVKGCN 
KYLYFVKGCN 
VYLYFIKNTR 
KYLYFIKGLN 



4565 
TLRRGAVLGF 
NLRRGAVLGY 
TLRRGAVLGY 
TLRRGAVLGY 
TLARGWWGT 
TLARGWWGT 
TLARGWWGT 
SIVRGMVLGA 
NLNRGMVLGS 



...,|....| 

4575 
IGATIRLQAG 
IGATVRLQAG 
IGATVRLQAG 
IGATVRLQAG 
ISSTVRLQAG 
ISSTVRLQAG 
LSSTVRLQAG 
ISNVVVLQSK 
LAATVRLQAG 



I | 

4585 
-KQTELAVNS 
-KQTEFVSNS 
-KQTEQAINS 
-KPTEHPSNS 
-TATEYASNS 
-TATEYASNS 
-TATEYASNS 
GHETEEVDAV 
-NATEVPANS 



I | 

4595 
GLLTACAFSV 
HLLTHCSFAV 
SLLTLCAFAV 
SLLTLCAFSP 
SILSLCAFSV 
SILSLCAFSV 
AIRSLCAFSV 
GILSLCSFAV 
TVLSFCAFAV 



|. ...| 

4605 
DPATTYLEAV 
DPAAAYLDAV 
DPAKTYIDAV 
DPAKAYVDAV 
DPKKTYLDFI 
DPKKTYLDFI 
DPKKTYLDYI 
DPADTYCKYV 
DPAKAYKDYL 



....|....| 

4615 
KHGAKPVSNC 
KQGAKPVGNC 
KSGHKPVGNC 
KRGMQPVNNC 
QQGGTPIANC 
QQGGTPIANC 
QQGGAPVTNC 
AAGNQPLGNC 
ASGGQPITNC 
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EMCR 

229E 

PEDV 

TGEV 

OC43 

BoCoV 

MHV 

AIPV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OC43 

BoCoV 

MHV 

AIPV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OC43 

BoCoV 

MHV 

AIPV 

SARS CoV 



••••(••••| 

4625 
IKMLSNGAGN 
VKMLTNGSGS 
VKMLANGSGN 
VKMLSNGAGN 
VKMLCDHAGT 
VKMLCDHAGT 
VKMLCDHAGT 
VKMLTVHNGS 
VKMLCTHTGT 

• ■ • • | • • • • | 

4685 
QVPIGCL-DP 
QVPIGTN-DP 
QVPLGTV-DP 
QIPTGTQ-DP 
QVPVGIK-DP 
QVPVGIK-DP 
QVPLGIK-DP 
QIPTTEK-DP 
QIPTTCANDP 

|.. 

4745 
GVLVQLD 
GALVPLD 



I | 

4635 
GQAITTSVDA 
GQAITCTIDS 
GQAVTNGVEA 
GMAVTNGVEA 
GMAITVKPDA 
GMAITVKPDA 
GMAITIKPEA 
GPAITSKPSP 
GQAITVTPEA 

4695
IRFCLENNVC 
IRFCLENTVC 
IRFVLENDVC 
IRPCIENEVC 
VSYVLTHDVC 
VSYVLTHDVC 
VSYVLTHDVC 
VGFCLRNKVC 
VGFTLRNTVC 



• I .1 

4645 
NTNQDSYGGA 
NTTQDTYGGA 
STNQDSYGGA 
NTQQDSYGGA 
TTSQDSYGGA 
TTSQDSYGGA 
TTNQDSYGGA 
TPDQDSYGGA 
NMDQESFGGA 

I | 

4705 
NVCGCWLGHG 
KVCGCWLNHG 
KVCGCWLSNG 
VVCGCWLNNG 
RVCGFWRDGS 
QVCGFWRDGS 
QVCGPWRDGS 
TVCQCWIGYG 
TVCGMWKGYG 



I | 

4655 
SICLYCRAHV 
SVCIYCRAHV 
SVCLYCRAHV 
SVCIYCRCHV 
SVCIYCRARV 
SVCIYCRARV 
SVCIYCRSRV 
SVCLYCRAHI 
SCCLYCRCHI 



4665 

PHP SMD 

AHP TMD 

EHP SMD 

EHP AID 

EHP DVD 

EHP DVD 

EHP DVD 

AHPGSVGNLD 
DHP NPK 



4725 



4715
CACDRTTIQS 

CTCDRTAIQS 

CTCDRSIMQS 

CMCDRTSMQS F 

CSCVSTDTTV Q 

CSCVSTDTTV Q 

CSCVGTGSQF Q 

CQCDSLRQPK SSVQSVAGAS 
CSCDQLREPL M- 



I | 

4675 
GYCKFKGKCV 
GFCQYKGKWV 
GFCRLKGKYV 
GLCRYKGKFV 
GLCKLRGKFV 
GLCKLRGKFV 
GLCKLRGKFV 
GRCQFKGSFV 
GFCDLKGKYV 

I | 

4735 
-VDISYLNEQ 
-FDNSYLNES 



TVDQSYLNEC 

SKDT 

SKDTNFLNGF 
SKDTNFLNGF 
DFDKNYLNGY 
QSADASTFLN 



GVLVQLD 

GVRV 

GVQV 

GVAVRLG 
GFAV 



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EMCR 

229E 

PEDV 

TGEV 

BoCoV 

OC43 

MHV 

AIPV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

BoCoV 

OC43 

MHV 

AIPV 

SARS CoV



EMCR 

229E

PEDV 

TGEV 

BoCoV 

OC43 

MHV 

AIPV 

SARS CoV



. . . . I | ....|. ...| ....|. ...| ....I...., 

5 15 25 35 45 

Z ZZ RARGSSAARL EPCN— GTDID KCVRAFDIYN 

EPCN-GTDID YCVRAFDVYN 

YGLFK RVRGSSAARL EPCN-GTDTQ HVYRAFDIYN 

EPCN-GTDPD HVSRAFDIYN 
FFKR VRGTSVDARL VPCASGLSTD VQLRAFDICN 

7ZZZZZZ FFKR VRGTSVDARL VPCASGLSTD VQLRAFDIYN 

LFLCRHRLPV SVKRHELFKR VRGTSVNARL VPCASGLDTD VQLRAFDICN 



I 



TPCGTGTSTD WYRAFDIYN 



55 

KNVSFLGKCL
KDASFIGKNL 
KDVACLGKFL 
KDVACIGKFL 
ASVAGIGLHL 
ASVAGIGLHL 
ANRAGIGLYY 

MFQNL 

EKVAGFAKFL 



-.-.|. ...| 
65 

KMNCVRFKNA 
KSNCVRFKNV 
KVNCVRLKNL 
KTNCSRFRNL 
KVNCCRFQRV 
KVNCCRFQRV 
KVNCCRFQRA 
KRNCARFQEL 
KTNCCRFQEK 



75 



I 



•I -...I 
85 

DL KDGYFVIKRC 

DK DDAFYIVKRC 

DK HDAFYWKRC 

DK HDAYYIVKRC 

— DENG — DK LDQFFWKRT 
— DENG — DK LDQFFWKRT 
— DEDG — NT LDKFFVIKRT 
.RDTEDGNLEY. LDSYFWKQT- 
— DEEG — NL LDSYFVVKRH 



95 

TKSVMEHEQS 
IKSVMDHEQS
TKSAMEHEQS 
TKTVMDHEQV 
DLTIYNREME 
DLTIYNREMK 
NLEVYNKEKE 
TPSNYEHEKS 
TMSNYQHEET 



105 
MYNLLNFSGA 
MYNLLKGCNA 
IYSRLEKCGA 
CYNDLKDSGA 
CYERVKDCKF 
CYERVKDCKF 
CYELTKECGV 
CYEDLKS-EV 
IYNLVKDCPA 



I | 

115 
LAEHDFFTWK
VAKHDFFTWH
IAEHDFFTWK 
VAEHDFFTYK 
VAEHDFFTFD 
VAEHDFFTFD 
VAEHEFFTFD 
TADHDFFVFN
VAVHDFFKFR 



....|....| 

125 
DGRVIYGNVS 
EGRTIYGNVS 
DGRAIYGNVC 
EGRCEFGNVA 
VEGSRVPHIV 
VEGSRVPHIV 
VEGSRVPHIV 

KN IYNIS 

VDGDMVPHIS 



• • - • I I 

135 
RHNLTKYTMM 
RQDLTKYTMM 
RKDLTEYTMM 
RRNLTKYTMM 
RKDLTKYTML 
RKDLTKYTML 
RKDLSKYTML 
RQRLTKYTMM 
RQRLTKYTMA 



• • . . 1 I 

145 
DLVYAMRNFD 
DLCFALRNFD 
DLCYALRNFD 
DLCYAIRNFD 
DLCYALRHFD 
DLCYALRHFD 
DLCYALRHFD 
DFCYALRHFD 
DLVYALRHFD 



155 
EQNCDVLKEV 
EKDCEVFKEI 
ENNCDVLKSI 
EKNCEVLKEI 
RNDCMLLCDI 
RNDCMLLCDI 
RNDCSTLKEI 
PKDCEVLKEI 
EGNCDTLKEI 



165 
LVLTGCCDNS 
LVLTGCCSTD 
LIKVGACEES 
LVTVGACTEE 
LSIYAGCEQS 
LSIYAGCEQS 
LLTYAECDES 
LVTYGCIEDY 
LVTYNCCDDD 



I | 

175 

YFDSKG 

YFBMKN 

YFNNKV 

FFENKD 

YFTKKD 

YFTKKD 

YFQKKD 

HPKWFEENKD 
YFNKKD 



....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

185 195 205 215 225 235 

EMCR WYDPVENEDI HRVYASLGKI VARAMLKCVA LCDAMVAKGV VGVLTLDNQD LNGNFYDFGD 

229E WFDPIENEDI HRVYAALGKV VANAMLKCVA FCDEMVLKGV VGVLTLDNQD LNGNFYDFGD

PEDV WFDPVENEDI HRVYALLGTI VARAMLKCVK FCDAMVEQGI VGVVTLDNQD LNGDFYDFGD

TGEV WFDPVENEAI HEVYAKLGPI VANAMLKCVA FCDAIVEKGY IGVITLDNQD LNGNFYDFGD 

BoCoV WYDFVENPDI INVYKKLGPI FNRALVSATE FADKLVEVGL VGILTLDNQD LNGKWYDFGD 

OC43 WYDFVENPDI INVYKKLGPI FNRALVSATE FADKLVEVGL VGVLTLDNQD LNGKWYDFGD

MHV WYDFVENSDI INVYKKLGPI FNRALLNTAK FADTLVEAGL VGVLTLDNQD LYGQWYDFGD

AIPV WYDPIENSKY YVMLAKMGPI VRRALLNAIE FGNLMVEKGY VGVITLDNQD LNGKFYDFGD 

SARS CoV WYDFVENPDI LRVYANLGER VRQSLLKTVQ FCDAMRDAGI VGVLTLDNQD LNGNWYDFGD 

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

245 255 265 275 285 295

EMCR FVVSLPNMGV PCCTSYYSYM MPIMGLTNCL ASECFVKSDI FGSDFKTFDL LKYDFTEHKE 

229E FVLCPPGMGI PYCTSYYSYM MPVMGMTNCL ASECFMKSDI FGQDFKTFDL LKYDFTEHKE

PEDV FTCSIKGMGV PICTSYYSYM MPVMGMTNCL ASECFVKSDI FGEDFKSYDL LEYDFTEHKT 

TGEV FVKTAPGFGC ACVTSYYSYM MPLMGMTSCL ESENFVKSDI YGSDYKQYDL LAYDFTEHKE 

BoCoV YVIAAPGCGV AIADSYYSYM MPMLTMCHAL DCELYVNNAY R LFDL VQYDFTDYKL

OC43 YVIAAPGCGV AIADSYYSYI MPMLTMCHAL DCELYVNNAY R LFDL VQYDFTDYKL 

MHV FVKTVPGCGV AVADSYYSYM MPMLTMCHAL DSELFINGTY R EFDL VQYDFTDFKL 

AIPV FQKTAPGAGV PVFDTYYSYM MPIIAMTDAL APERYFEYDV HKG-YKSYDL LKYDYTEEKQ

SARS CoV FVQVAPGCGV PIVDSYYSLL MPILTLTRAL AAESHMDADL AKP-LIKWDL LKYDFTEERL 


....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

305 315 325 335 345 355 

EMCR NLFNKYFKHW SFDYHPNCSD CYDDMCVIHC ANFNTLFATT IPGTAFGPLC RKVFIDGVPL 

229E VLFNKYFKYW GQDYHPDCVD CHDEMCILHC SNFNTLFATT IPNTAFGPLC RKVFIDGVPV 

PEDV ALFNKYFKYW GLQYHPNCVD CSDEQCIVHC ANFNTLFSTT IPITAFGPLC RKCWIDGVPL

TGEV YLFQKYFKYW DRTYHPNCSD CTSDECIIHC ANFNTLFSMT IPMTAFGPLV RKVHIDGVPV 

BoCoV ELFNKYFKHW SMPYHPNTVD CQDDRCIIHC ANFNILFSMV LPNTCFGPLV RQIFVDGVPF 

OC43 ELFNKYFKHW SMPYHPNTVD CQDDRCIIHC ANFNILFSMV LPNTCFGPLV RQIFVDGVPF 

MHV ELFNKYFKYW SMTYHPNTCE CEDDRCIIHC ANFNILFSMV LPKTCFGPLV RQIFVDGVPF 

AIPV ELFQKYFKYW DQEYHPNCRD CSDDRCLIHC ANFNILFSTL IPQTSFGNLC RKVFVDGVPF

SARS CoV CLFDRYFKYW DQTYHPNCIN CLDDRCILHC ANFNVLFSTV FPPTSFGPLV RKIFVDGVPF 

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

365 375 385 395 405 415

EMCR VTTAGYHFKQ LGLVWNKDVN THSVRLTITE LLQFVTDPSL IIASSPALVD QRTICFSVAA

229E VATAGYHFKQ LGLVWNKDVN THSTRLTITE LLQFVTDPTL IVASSPALVD KRTVCFSVAA

PEDV VTTAGYHFKQ LGIVWNNDLN LHSSRLSINE LLQFCSDPAL LIASSPALVD QRTVCFSVAA 

TGEV WTAGYHFKQ LGIVWNLDVK LDTMKLSMTD LLRFVTDPTL LVASSPALLD QRTVCFSIAA 

BoCoV WSIGYHYKE LGIVMNMDVD THRYRLSLKD LLLYAADPAL HVASASALYD LRTCCFSVAA

OC43 WSIGYHYKE LGIVMNMDVD THRYRLSLKD LLLYAADPAL HVASASALYD LRTCCFSVAA

MHV WSIGYHYKE LGWMNMDVD THRYRLSLKD LLLYAADPAL HVASASALLD LRTCCFSVAA 

AIPV IATCGYHSKE LGVIMNQDNT MSFSKMGLSQ LMQFVGDPAL LVGTSNNLVD LRTSCFSVCA 

SARS CoV VVSTGYHFRE LGWHNQDVN LHSSRLSFKE LLVYAADPAM HAASGNLLLD KRTTCFSVAA 

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

425 435 445 455 465 475 

EMCR LSTGLTNQW KPGHFNEEFY NFLRLRGFFD EGSELTLKHF FFAQNGDAAV KDFDFYRYNK 

229E LSTGLTSQTV KPGHFNKEFY DFLRSQGFFD EGSELTLKHF FFTQKGDAAI KDFDYYRYNR

PEDV LGTGMTNQTV KPGHFNKEFY DFLLEQGFFS EGSELTLKHF FFAQKVDAAV KDFDYYRYNR 

TGEV LSTGITYQTV KPGHFNKDFY DFITERGFFE EGSELTLKHF FFAQGGEAAM TDFNYYRYNR

BoCoV ITSGVKFQTV KPGNFNQDFY DFILSKGLLK EGSSVDLKHF FFTQDGNAAI TDYNYYKYNL

OC43 ITSGVKFQTV KPGNFNQDFY DFVLSKGLLK EGSSVDLKHF FFTQDGNAAI TDYNYYKYNL 

MHV ITSGVKFQTV KPGNFNQDFY EFILSKGLLK EGSSVDLKHF FFTQDGNAAI TDYNYYKYNL 

AIPV LTSGITHQTV KPGHFNKDFY DFAEKAGMFK EGSSIPLKHF FYPQTGNAAI NDYDYYRYNR 

SARS CoV LTNNVAFQTV KPGNFNKDFY DFAVSKGFFK EGSSVELKHF FFAQDGNAAI SDYDYYRYNL

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

485 495 505 515 525 535 

EMCR PTILDICQAR VTYKIVSRYF DIYEGGCIKA CEVWTNLNK SAGWPLNKFG KASLYYESIS 

229E PTMLDIGQAR VAYQVAARYF DCYEGGCITS REVVVTNLNK SAGWPLNKFG KAGLYYESIS

PEDV PTVLDICQAR WYQIVQRYF DIYEGGCITA KEWVTNLNK SAGYPLNKFG KAGLYYESLS 

TGEV VTVLDICQAQ FVYKIVGKYF ECYDGGCINA REVVVTNYDK SAGYPLNKFG KARLYYETLS 

BoCoV PTMVDIKQLL FVLEVVYKYF EIYDGGCIPA AQVIVNNYDK SAGYPFNKFG KARLYYEALS

OC43 PTMVDIKQLL FVLEVVYKYF EIYDGGCIPA SQVIVNNYDK SAGYPFNKFG KARLYYEALS 

MHV PTMVDIKQLL FVLEVVNKYF EIYDGGCIPA TQVIVNNYDK SAGYPFNKFG KARLYYEALS

AIPV PTMFDICQLL FCLEVTSKYF ECYEGGCIPA SQVWNNLDK SAGYPFNKFG KARLYYE-MS 

SARS CoV PTMCDIRQLL FVVEVVDKYF DCYDGGCINA NQVIVNNXDK SAGFPFNKWG KARLYYDSMS 

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

545 555 565 575 585 595

EMCR YEEQDALFAL TKRNVLPTMT QLNLKYAISG KERARTVGGV SLLSTMTTRQ YHQKHLKSIV 

229E YEEQDAIFSL TKRNILPTMT QLNLKYAISG KERARTVGGV SLLATMTTRQ FHQKCLKSIV 

PEDV YEEQDELYAY TKRNILPTMT QLNLKYAISG KERARTVGGV SLLSTMTTRQ YHQKHLKSIV 

TGEV YEEQDALFAL TKRNVLPTMT QMNLKYAISG KARARTVGGV SLLSTMTTRQ YHQKHLKSIA 

BoCoV FEEQDEIYAY TKRNVLPTLT QMNLKYAISA KNRARTVAGV SILSTMTGRM FHQKCLKSIA



OC43 
MHV 
AIPV 
SARS CoV 
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229E 
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BoCoV 

OC43 

MHV 

AIPV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

BoCoV 

OC43 

MHV 

AIPV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

BoCoV

OC43 

MHV 

AIPV 

SARS CoV



EMCR 

229E 

PEDV 

TGEV 

BoCoV

OC43 

MHV 

AIPV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

BoCoV 

OC43 

MHV 

AIPV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

BoCoV 

OC43 

MHV 

AIPV 

SARS CoV 



EMCR 



FEEQDEIYAY TKRNVLPTLT QMNLKYAISA KNRARTVAGV SILSTMTGRM FHQKCLKSIA 
FEEQDEVYAY TKRNVLPTLT QMNLKYAISA KNRARTVAGV SILSTMTGRM FHQKCLKSIA 
LEEQDQLFEI TKKNVLPTIT QMNLKYAISA KNRARTVAGV SILSTMTNRQ FHQKILKSIV 
YEDQDALFAY TKRNVIPTIT QMNLKYAISA KNRARTVAGV SICSTMTNRQ FHQKLLKSIA



....|.... | 

605 
NTRNATVVIG 
ATRNATVVIG 
NTRGASVVIG
ATRNATVVIG 
ATRGVPWIG 
ATRGVPWIG 
ATRGVPWIG 
NTRNASVVIG 
ATRGATWIG 

I | 

665 
KHVNCCTVTD 
KHVTCCTASD 
KHTTCCSSTD 
KHVGCCTHND 
KHEACCSQSD
KHETCCSQSD 
KHDSCCSHTD 
KHTNCCSWSE 
KHNTCCNLSH 

| | 

725 
SSNINRLLSV 
SSNINCVLSV 
SANVNKLLSV 
SANVNKLLGV 
SANVCALMSC 
SANVCALMSC 
SANVCSLMAC 
SANVARLLSV 
TANVNALLST 



.•..|.. ..| 

615 
TTKFYGGWNN 
TTKFYGGWDN 
TTKFYGGWDN 
STKFYGGWDN 
TTKFYGGWDD 
TTKFYGGWDD 
TTKFYGGWDD 
TTKFYGGWDN 
TSKFYGGWHN 

I.. ..I 

675 
RFYRLGNELA 
KFYRLSNELA 
RFFRLCNELA 
RFYRLSNELA 
RFYRLANECA 
RFYRLANECA 
RFYRLANECA 
RIYRLYNECA 
RFYRLANECA 

••••I » • • » I 

735 
PSDSCNNVNV 
NSSNCNNFNV 
DSNVCHNLEV 
DSNACNNVTV 
NGNKIEDLSI 
NGNKIEDLSI 
NGHKIEDLSI 
ITRDIVYDNI 
DGNKIADKYV 



625 
MLRTLIDGVE 
MLKNLMADVD 
MLKNLIDGVB 
MLKNLMRDVD 
MLRRLIKDVD 
MLRRLIKDVD 
MLRRLIKDVD 
MLRNLIQGVE 
MLKTVYSDVE 

....!..,.! 

685 
QVLTEWYSN 
QVLTEVVYSN 
QVLTEWYSN 
QVLTEWHCT 
QVLSEIVMCG 
QVLSEIVMCG 
QVLSEIVMCG 
QVLSETVLAT 
QVLSEMVMCG 



....|. ...| 

635 
NPMLMGWDYP 
DPKLMGWDYP 
NPCLMGWDYP 
NGCLMGWDYP 
NPVLMGWDYP 
NPVLMGWDYP 
SPVLMGWDYP 
DPILMGWDYP 
TPHLMGWDYP 

I | 

695 
GGFYFKPGGT 
GGFYFKPGGT 
GGFYLKPGGT 
GGFYFKPGGT 
GCYYVKPGGT 
GCYYVKPGGT 
GCYYVKPGGT 
GGIYVKPGGT 
GSLYVKPGGT 



645 
KCDRALPNMI 
KCDRAMPSMI 
KCDRALPNMI 
KCDRALPNMI 
KCDRAMPNIL 
KCDRAMPNLL 
KCDRAMPNIL 
KCDRAMPNLL 
KCDRAMPNML 

• I 

705 
TSGDASTAYA 
TSGDATTAYA 
TSGDATTAYA 
TSGDGTTAYA 
SSGDATTAFA 
SSGDATTAFA 
SSGDATTAFA 
SSGDATTAYA 
SSGDATTAYA 



....|....| 

655 
RMISAMVLGS 
RMLSAMILGS 
RMISAMILGS 
RMASAMILGS 
RIVSSLVLAR 
RIVSSLVLAR 
RIISSLVLAR 
RIAASLVLAR 
RIMASLVLAR 

•••»!•••»! 

715 
NSIFNIFQAV 
NSVFNIFQAV 
NSVFNIFQAV 
NSAFNIFQAV 
NSVFNICQAV 
NSVFNICQAV 
NSVFNICQAV 
NSVFNIIQAT 
NSVFNICQAV 



.-..|.. ..| 

785 
DGWCYNKDY 
DSVVCYNKTY 
DGWCYNNDY 
DGWCYNKDY 
DGVVCYNSDY 
DGWCYNSDY 
DGVVCYNSEF 
DGWCYNNTL 
DAWCYNSNY 

• • « »(••»»( 

845 
VDKDGTYYLP 
VDENGKYYLP 
VDKEGTYYLP 
VGPDGDYYLP 
KMDGDDVYLP 
KMDGDDVYLP 
KMDGDEVYLP 
EVDGEPKYLP 
KQGDDYVYLP 



I | 

905

FYVLLDWVKH 
FYALLDWVKH 
FYVLLDWVKH 
FYTLLDWVKH 
FRVYLEYIKK 
FRVYLAYIKK 
FRVYLEYIKK 
FFVLLAYIRK 
FHLYLQYIRK 



I | 

795 
AELGYIADIS 
AGLGYIADIS 
ASLGYVADLN
ADLGYVADIN
ASKGYIANIS
ASKGYIANIS
ASKGYIANIS
AKQGLVADIS
AAQGLVASIK

855
YPDPSRILSA 
YPDPSRIISA 
YPDPSRILSA 
YPDPSRILSA 
YPVPSRILGA 
YPNPSRILGA 
YPDPSRILGA 
YPDPSRILGA 
YPDPSRILGA 

I | 

915
LNKNLNEGVL 
LNKTLNEGVL 
LYKTLNAGVL 
LQKNLNAGVL 
LYNELGNQIL 
LYNDLGNQIL 
LYNDLGNQIL 
LYQELSQNML 
LHDELTGHML 



) | 

745 
RDLQRRLYDN 
KKLQRQLYDN 
KQLQRKLYEC 
KSIQRKIYDN 
RALQKRLYSH 
RALQKRLYSH 
RELQKRLYSN 
KSLQYELYQQ 
RNLQHRLYEC 

I 



>•••)..,. | 

755 
CYRLTSVEES 
CYRNSNVDES 
CYRSTIVDDQ 
CYRSSSIDEE 
VYRSDMVDST 
VYRSDKVDST 
VYRADHVDPA 
VYRRVNFDPA 
LYRNRDVDHE 



I | 

765 
FIDDYYGYLR 
FVDDFYGYLQ 
FVVEYYGYLR 
FVVEYFSYLR 
FVTEYYEFLN 
FVTEYYEFLN 
FVNEYYEFLN 
FVEKFYSYLC 
FVDEFYAYLR 



....i — i 

775 
KHFSMMILSD 
KHFSMMILSD 
KHFSMMILSD 
KHFSMMILSD 
KHFSMMILSD 
KHFSMMILSD 
KHFSMMILSD 
KNFSLMILSD 
KHFSMMILSD 



805 
AFKATLYYQN 
AFKATLYYQN 
AFKAVLYYQN 
AFKATLYYQN 
AFQQVLYYQN 
AFQQVLYYQN 
AFQQVLYYQN 
GFREVLYYQN 
NFKAVLYYQN 

| • • • . | 
865 
GVFVDDVVKT 
GVFVDDITKT 
GVFVDDVVKT 
GVFVDDIVKT 
GCFVDDLLKT 
GCFVDDLLKT 
GCFVDDLLKT 
CVFVDDVDKT 
GCFVDDIVKT 

I | 

925 
ESFSVTLLDN 
ESFSVTLLDE 
ESFSVTLLED 
DSFSVTMLEE 
DSYSVILSTC 
DSYSVILSTC 
DSYSVILSTC 
MDYSFVMDID 
DMYSVMLTND 



815 
NVFMSTSKCW 
GVFMSTAKCW 
NVFMSASKCW 
NVFMSTSKCW 
NVFMSESKCW 
NVFMSESKCW 
NVFMSEAKCW 
NVFMADSKCW 
NVFMSEAKCW 

• ... J .... I 

875 
DAWLLXRYV 
DAVILLERYV 
DAWLLERYV 
DNVIMLERYV 
DSVLLIERFV 
DSVLLIERFV 
DSVLLIERFV 
EPVAVMERYI 
DGTLMIERFV 



825
VEEDLTKGPH 
TEEDLSIGPH 
IEPDINKGPH 
VEPDLSVGPH 
VENDINNGPH 
VEHDINNGPH 
VETDIEKGPH 
VEPDLBKGPH 
TETDLTKGPH 

I 

885 
SLAIDAYPLS
SLAIDAYPLS
SLAIDAYPLS
SLAIDAYPLT
SLAIDAYPLV
SLAIDAYPLV
SLAIDAYPLV
ALAIDAYPLV
SLAIDAYPLT 



I 



835 
EFCSQHTMQI 
EFCSQHTMQI 
EFCSQHTMQI 
EFCSQHTLQI 
EFCSQHTMLV 
EFCSQHTMLV 
EFCSQHTMLV 
EFCSQHTMLV 
EFCSQHTMLV 

* • • • I I 

895 
KHPNSEYRKV 
KHPBCPEYRKV 
KHENPEYKKV 
KHPKPAYQKV 
YHENEEYQKV 
YHENEEYQKV 
YHENPEYQNV 
HHENEEYKKV 
KHPNQEYADV 



.1 



935
QEDKFWCEDF 
HESKFWDESF 
STAKFWDESF 
GQDKFWSEEF 
DGQKFTDESF 
DGQKFTDESF 
DGQKFTDETF 
KGSKFWEQEF 
NTSRYWEPEF 



945
YASMYENSTI 
YASMYEKSTV 
YANMYEKSAV 
YASLYEKSTV 
YKNMYLRSAV 
YKNMYLRSAV 
YKNMYLRSAV 
YENMYRAPTT 
YEAMYTPHTV 



I 



|. 

955 
LQAAGLCVVC 
LQAAGLCWC 
LQSAGLCWC 
LQAAGMCVVC 
MQSVGACVVC 
MQSVGACWC 
MQSVGACVVC 
LQSCGVCWC 
LQAVGACVLC 



..|.. 
965 



I 



975 



I 
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985 



GSQTVLRCGD CLRKPMLCTK CAYDHVFGTD HKFILAITPY VCNASGCGVS DVKKLYLGGL 



229E
PEDV
TGEV
BoCoV
OC43
MHV
AIPV
SARS CoV
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EMCR 

229E 

PEDV 

TGEV 

BoCoV 

OC43 

MHV 

AIPV 

SARS CoV



EMCR 

229E 

PEDV 

TGEV 

BoCoV 

OC43 

MHV 

AIPV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

BoCoV 

OC43 

MHV 

AIPV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

BoCoV 

OC43 

MHV 

AIPV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

BoCoV 

OC43

MHV 

AIPV 

SARS CoV 



GSQTVLRCGD 
GSQTVLRCGD 
GSQTVLRCGD 
SSQTSLRCGS 
SSQTSLRCGS 
SSQTSLRCGS 
NSQTILRCGN 
NSQTSLRCGA 

I 



CLRRPMLCTK 
CLRRPMLCTK 
CLRRPLLCTK 
CIRKPLLCCK 
CIRKPLLCCK 
CIRKPLLCCK 
CIRKPFLCCK 
CIRRPFLCCK 



CAYDHVFGTD 
CAYDHVIGTT 
CAYDHVMGTK 
CCYDHVMATD 
CCYDHVMATD 
CAYDHVMSTD 
CCYDHVMHTD 
CCYDHVISTS 



HKFILAITPY 
HKFILAITPY 
HKFIMSITPY 
HKYVLSVSPY 
HKYVLSVSPY 
HKYVLSVSPY 
HKNVLSINPY 
HKLVLSVNPY 



1025 
NYYCTNHKPQ 
NYYCVDHKPH 
SYWCHEHKPR 
SYYCMNHKPQ 
SYYCEDHKPQ 
SYYCEDHKPQ 
SYYCEDHKPQ 
SYFCGNHKPK 
SYYCKSHKPP 





1035 
LSFPLCSAGN 
LSFPLCSAGN 
LAFPLCSAGN 
LSFPLCANGN 
YSFKLVMNGM 
YSFKLVMNGL 
YSFKLVMNGM 
LSIPLVSNGT 
ISFPLCANGQ 



....| ....I 

1045 
IFGLYKNSAT 
VFGLYKSSAL 
VFGLYKNSAT 
VFGLYKSSAV 
VFGLYKQSCT 
VFGLYKQSCT 
VFGLYKQSCT 
VFGIYRANCA 
VFGLYKNTCV 



....I.. ..I 

1085 
TLRLFAAETI 
SLRLFAAETV
SLRLFAAETI
SLKIFAAETV
RLKLFAAETQ 
RLKLFAAETQ 
RLKLFAAETQ 
SLRRFAAETV 
RLKLFAAETL 

1 1 



....|.... I 

1095 
KAKEESVKSS 
KAKEESVKSS 
KAKEESVKSS 
KAKEESVKSE 
KATEEAFKQS 
KATEEAFKQS 
KATEESFKQC 
KATEELHKQQ 
KATEETFKLS 



1 



1145 
KDSKFQIGEF 
KDSKFQVGEF 
KNTKFQIGEF 
KDTKIQLGEF 
KNGKTVLGEY 
KNGKTVLGEY 
SNGKTVLGEY 
RTSKVQLGDF 
KNSKVQIGEY 



I . - 

1155 
IFEKVEYGSD 
VFEKVDYGSD 
VFEKAEYDND 
VFEQSEYGSD 
VFDKSEL-TN 
VFDKSEL-TN 
VFDKSEL-TN 
TFEKGEG-KD 
TFEKGDY-GD 





1105 
YAFATLKEW 
YAYATLKEIV
YACATLHEVV 
YAYAVLKEVI 
YASATIQEIV 
YASATIQEIV 
YASATIREIV 
FASAEVREVF 
YGIATVREVL 

I 



..,.1 I 

1055 
GSLDVEVFNR 
GSMDIDVFNK 
GSPDVEDFNR 
GSEAVEDFNK 
GSPYIDDFNR 
GSPYIDDFNR 
GSPYIEDFNK 
GSENVDDFNQ 
GSDNVTDFNA 

1. 



VCNTSGCNVN 
VCCASDCGVN 
VCSFNGCNVN 
VCNAPGCDVN 
VCNAPGCDVN 
VCNSPGCDVN 
ICSQLGCGEA 
VCNAPGCDVT 

I 



DVTKLYLGGL 
DVTKLYLGGL 
DVTKLFLGGL 
DVTKLYLGGM 
DVTKLYLGGM 
DVTKLYLGGM 
DVTKLYLGGM 
DVTQLYLGGM 



1065 
LATSDWTDVR 
LSTSDWSDIR 
IATSDWTDVS 
LAVSDWTNVE 
IASCKWTDVD 
IASCKWTDVD 
IASCKWTEVD 
LATTNWSIVE 
IATCDWTNAG 



I | 

1075 
DYKLANDVKD 
DYKLANDAKE 
DYRLANDVKD 
DYKLANNVKE 
DYILANECTE 
DYILANECTE 
DYVLANECTE 
PYILANRCSD 
DYILANTCTE 



1115 
GPKELLLSWE 
GPKELLLLWE 
GPKELLLKWE 
GPKEIVLQWE 
SERELILSWE 
SERELILSWE 
SDRELILSWE 
SDRELILSWE 
SDRELHLSWE 



1165 
TVTYKSTVTT 
TVTYKSTATT 
AVTYKTTATT 
SVYYKSTSTY 
GVYYRATTTY 
GVYYRATTTY 
GVYYRATTTY 
WYYKATSTA 
AVVYRGTTTY 



1 1 

1175 
KLVPGMIFVL 
KLVPGMLFIL 
KLVPGMVFVL 
KLTPGMIFVL 
KLSVGDVFVL 
KLSVGDVFVL 
KLSVGDVFII
KLSVGDIFVI
KLNVGDYFVL 



I | 

1125 
SGKVKPPLNR
SGKAKPPLNR 
VGRPKPPLNR 
ASKTKPPLNR 
IGKVKPPLNK 
IGKVKPPLNK 
IGKVRPPLNK 
PGKTRPPLNR 
VGKPRPPLNR 

I | 



I | 

1135 
NSVFTCFQIS 
NSVFTCFQIT 
NSVFTCYHIT 
NSVFTCFQIS 
NYVFTGYHFT 
NYVFTGYHFT 
NYVFTGYHFT 
NYVFTGYHFT 
NYVFTGYRVT 



1185 
TSHNVQPLRA 
TSHNVAPLRA 
TSHNVQPLRA 
TSHNVSPLKA 
TSHSVANLSA 
TSHSVANLSA 
TSHAVSSLSA 
TSHNWSLVA 
TSHTVMPLSA 



I I 

1195 
PTIANQEKYS 
PTMANQEKYS 
PTIANQERYS 
PILVNQEKYN 
PTLVPQENYS 
PTLVPQENYS 
PTLVPQENYT 
PTLCPQQTFS 
PTLVPQEHYV 



....|....| 

1205 
SIYKLHPAFN 
TIYKLHPSFN 
TIHKLHPAFN 
TISKLYPVFN 
SIR-FASVYS 
SIR-FASVYS 
SIR-FASVYS 
RFVNLRPNVM 
RITGLYPTLN 



i I 

1215 
VSDAYANLVP 
VSDAYANLVP 
IPEAYSSLVP 
IAEAYNTLVP 
VLETFQNNW 
VLETFQNNVV 
VPETFQNNVP 
VPECFVNNIP 
ISDEFSSNVA 



I I 

1225 
YYQLIGKQKI 
YYQLIGKQRI 
YYQLIGKQKI 
YYQMIGKQKF 
NYQHIGMKRY 
NYQHIGMKRY 
NYQHIGMKRY 
LYHLVGKQKR 
NYQKVGMQKY 



....!-. 

1235 
TTIQGPPGSG 
TTIQGPPGSG 
TTIQGPPGSG 
TTIQGPPGSG 
CTVQGPPGTG 
CTVQGPPGTG 
CTVQGPPGTG 
TTVQGPPGSG 
STLQGPPGTG 



....|....| 

1245 
KSHCSIGLGL 
KSHCSIGIGV 
KSHCVIGLGL 
KSHCVIGLGL 
KSHLAIGLAV 
KSHLAIGLAV 
KSHLAIGLAV 
KSHFAIGLAV 
KSHFAIGLAL 



! | 

1255 
YYPGARIVFV 
YYPGARIVFT 
YYPGARIVFT 
YYPQARIVYT 
YYCTARVVYT 
FYCTARVVYT 
YYCTARVVYT 
YFSSARWFT 
YYPSARIVYT 



I I 

1265 
ACAHAAVDSL 
ACSHAAVDSL 
ACSHAAVDSL 
ACSHAAVDAL 
AASHAAVDAL 
AASHAAVDAL 
AASHAAVDAL 
ACSHAAVDAL 
ACSHAAVDAL 

....|....| 

1325 
ADIWVDEVS 
ADIWVDEVS 
ADIWVDEVS 
CDIVVVDEVS 
TDIVVVDEVS 
TDIVWDEVS 
TDIIVVDEVS 
CDILLVDEVS 
ADIVVFDEIS 



I I 

1275 
CAKAMTVYSI 
CAKAVTAYSV
CVKASTAYSN 
CEKAAKNFNV 
CEKAYKFLNI 
CEKAYKFLNI 
CEKAYKFLNI 
CEKAFKFLKV 
CEKALKYLPI 



1285 
DKCTRIIPAR 
DKCTRIIPAR 
DKCSRIIPQR 
DRCSRIIPQR 
NDCTRIVPAK 
NDCTRIVPAK 
NDCTRIVPAK 
DDCTRIVPQR 
DKCSRIIPAR 



....| I 

1295 
ARVECYSGFK
ARVECYSGFK 
ARVECYDGFK 
IRVDCYTGFK 
VRVECYDKFK 
VRVECYDKFK 
VRVDCYDKFK 
TTVDCFSKFK 
ARVECFDKFK 



...,|....| 

1305 
PNNTSAQYIF 
PNNNSAQYVF 
SNNTSAQYLF 
PNNTNAQYLF 
INDTTRKYVF 
INDTTRKYVF 
VNDTTRKYVF 
ANDTGKKYIF 
VNSTLEQYVF 



I | 

1315 
STVNALPECN 
STVNALPEVN 
STVNALPECN 
CTVNALPEAS 
TTINALPEMV 
TTINALPEMV 
TTINALPELV 
STINALPEVS 
CTVNALPETT 



1335 
MCTNYDLSVI 
MCTNYDLSVI 
MCTNYDLSVI 
MCTNYDLSVI 
MLTNYELSVI 
MLTNYELSVI 
MLTNYELSVI 
MLTNYELSFI 
MATNYDLSVV 



i | 

1345 
NQRLSYKHIV 
NQRISYKHIV 
NQRISYRHVV 
NSRLSYKHIV 
NARIRAKHYV 
NARIRAKHYV 
NSRVRAKHYV 
NGKINYQYVV 
NARLRAKHYV 



| | 

1355 
YVGDPQQLPA 
YVGDPQQLPA 
YVGDPQQLPA 
YVGDPQQLPA 
YIGDPAQLPA 
YIGDPAQLPA 
YIGDPAQLPA 
YVGDPAQLPA 
YIGDPAQLPA 



. . . . I . I 

1365 
PRVMITKGVM 
PRVLISKGVM 
PRVMISRGTL 
PRTLINKGVL 
PRVLLSKGTL 
PRVLLSKGTL 
PRVLLNKGTL 
PRTLLN-GSL 
PRTLLTKGTL 



....|....| 

1375 
EPVDYNWTQ 
EPIDYNWTQ 
EPKDYNVVTQ 
QPQDYNVVTK 
EPKYFNTVTK 
EPKYFNTVTK 
EPRYFNSVTK 
SPKDYNVVTN 
EPEYFNSVCR 
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229E 
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BoCoV 
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MHV 

AIPV 
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EMCR 

229E 

PEDV 

TGEV 

BoCoV 

OC43 

MHV 

AIPV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

BoCoV 

OC43 

MHV 

AIPV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

BoCoV 

OC43 

MHV 

AIPV 

SARS CoV 



EMCR 
229E 
PEDV 

TGEV

BoCoV 

OC43 

MHV 

AIPV 

SARS CoV 



1385 
RMCAIGPDVF 
RMCAIGPDVF 
RMCALKPDVF 
RMCTLGPDVF 
LMCCLGPDIF 
LMCCLGPDIF 
LMCCLGPDIF 
LMVCVKPDIF 
LMKTIGPDMF 

1445 
GSSINRKQLE 
GSSINRRQLD 
GSSINRRQLD 
NSSINNKQLE 
SSAVNMQQIY 
SSAVNMQQIY 
SSAVNMQQIY 
GSAYNTTQLE 
SSAINRPQIG 



I 



1395 
LHKCYRCPAE 
LHKCYRCPAE 
LHKCYRCPAE 
LHKCYRCPAE 
LGTCYRCPKE 
LGTCYRCPKE 
LGTCYRCPKE 
LAKCYRCPKE 
LGTCRRCPAE 

• » ■ ■ | . . » . | 

1455 
IVKLFLVKNP 
WKRFIHKNS 
WRMFLAKNP 
WKAFLAHNP 
LINKFLKANP 
LINKFLKANP 
LISKFLKANP 
FVKDFVCRNK 
WREFLTRNP 



! 



....I. ...| 

1505 
IYAQTSDTAH 
IFAQTSDTAH 
IYAQTSDTAH 
IYTQTSDTQH 
IYSQTAETAH 
IYSQTAETAH 
IYSQTAETAH 
IFCVTADSQH 
IFTQTTETAH



I .... I 

1515 
ACNVNRFNVA 
ACNANRFNVA 
ASNVNRFNVA 
ATNVNRFNVA 
SVNVNRFNVA 
SVNVNRFNVA 
SVNVNRFNVA 
ALNINRFNVA 
SCNVNRFNVA 



1405 
IVNTVSELVY 
IVNTVSELVY 
IVRTVSEMVY 
IVKTVSALVY 
IVDTVSALVY 
IVDTVSALVY 
IVDTVSALVY 
IVDTVSTLVY 
IVDTVSALVY 

1465
SWSKAVFISP 
TWSKAVFISP 
RWSKAVFISP 
KWRKAVFISP 
LWHKAVFISP 
LWHKAVFISP 
SWSNAVFISP 
QWREAIFISP 
AWRKAVFISP 

I 



1415 
ENKFVPVKPA 
ENKFVPVKEA 
ENQFIPVHPD 
ENKFVPVNPE 
ENKLKAKNES 
ENKLKAKNES 
HNKLKAKNDN 
DGKFIANNPE 
DNKLKAHKDK 

■ 

1475 
YNSQNYVASR 
YNSQNYVAAR 
YNSQNYVASR 
YNSQNYVARR 
YNSQNFAAKR 
YNSQNFAAKR 
YNSQNYVAKR 
YNAMNQRAYR 
YNSQNAVASK 



1425 
SKQCFKIFFK 
SKQCFKIFER 
SKQCFKIFCK 
SKQCFKMFVK 
SSLCFKVYYK 
SSLCFKVYYK 
SSMCFKVYYK 
SRECFKVIVN 
SAQCFKMFYK 



I I 

1435 

G NVQVDN 

G SVQVDN 

G NVQVDN 

G QVQIES 

G VTTHES 

G VTTHES 

G QTTHES 

NGNSDVGHES 
G VITHDV 



1485 
FLGLQIQTVD 
LLGLQTQTVD 
LLGLQIQTVD 
LLGLQTQTVD 
VLGLQTQTVD 
VLGLQTQTVD 
VLGLQTQTVD 
MLGLNVQTVD 
ILGLPTQTVD 



1495 
SSQGSEYDYV 
SAQGSEYDYV 
SSQGSEYDYV 
SAQGSEYDYV 
SAQGSEYDYV 
SAQGSEYDYV 
SAQGSEYDFV 
SSQGSEYDYV 
SSQGSEYDYV 



1525 
ITRAKKGIFC 
ITRAKKGIFC 
ITRAKKGILC 
ITRAKVGILC 
ITRAKKGILC 
ITRAKKGILC 
ITRAKKGILC 
LTRAKRGILV 
ITRAKIGILC 



1535 
VMCDKT-LFD 
IMSDRT-LFD 
IMCDRS— LFD 
IMCDRT-MYE 
VMSNMQ-LFE 
VMSNMQ-LFE 
VMSSMQ-LFE 
VMRQRDELYS 
IMSDRD-LYD 



1545 
SLKFFEIKHA 
ALKFFEITMT 
LLKFFELKLS 
NLDFYELKDS 
ALQFTTLTVD 
ALQFTTLTLD 
SLNFSTLTLD 
ALKFTELDSE 
KLQFTSLEIP 



I | 

1555 

— DLHSS 

— DLQSE 

— DLQAN 

KIGLQAKP — 
KVPQAVETRV 
KVPQAVETKV 

KIN NPRL 

TSLQG 

RRN-VATLQA 



1 I 

1565 
-QVCGLFKNC 
-SSCGLFKDC 
-EGCGLFKDC 
-ETCGLFKDC 
QCSTNLFKDC 
QCSTNLFKDC 
QCTTNLFKDC 

TGLFKIC 

ENVTGLFKDC

. . . . | . . » . | 

1625 
FRFDISIPGS 
FRFDVSMPGS 
FRFDINIPNH 
FRFEANIPGY 
FKLDVTLDGY 
FKLDVTLDGY 
FKLDLTLDGY 
FKMSVNVEGC 
FKMNYQVNGY 



....|.,..| 

1575 
TRTPLNLPPT 
ARNPIDLPPS 
SRGDDLLPPS 
SKSEQYIPPA 
SKSYSGYHPA 
SKSYSGYHPA 
SRSYAGYHPA
NKEFSGVHPA 
SKIITGLHPT 

....|. ...| 

1635 
HSLFCTRDFA 
HSLFCTRDFA 
HTLFCTRDFA 
HTLFCTRDFA 
CKLFITKEEA 
CKLFITKEEA 
CKLFITRDEA 
HNMFITRDEA 
PNMFITREEA 



EMCR 
229E
PEDV 
TGEV 
BoCoV 



....!.... I 

1685 
QTEGCVSTNF 
QPEGCVLTNT 
RPEGCWTES 
QTEGCVITEK
EATGLFADRD 
EATGLFADRD 
EATGMFAERD 
TPEGLVDTSI 
VPTGYVDTEN 

I | 

1745 
ILVFVLWAGS 
VLVFVLWAGG 
ILIFVLWAGG 
ILIFVLWAGG 
CWLVTWAAN 



1695 
GDVIKPVCAK 
GSVVKPVRAR 
GDYIKPVRAR 
GNSIEWKAR 
GYSFKKAVAK
GYSFKKAVAK 
GYVFKKAVAR 
GNNFEPVNSK 
NTEFTRVNAK 

I I 

1755 
LELTTMRYFV 
LELTTMRYFV 
LELTTMRYFV 
LELTTMRYFV 
FELTCLRYFA 



I .... I 

1585 
HAHTFLSLSD 
HATTYLSLSD 
HANTFMSLAD
YATTYMSLSD 
HAPSFLAVDD 
HAPSFLAVDD
HAPSFLAVDD 
YAVTTKALAA 
QAPTHLSVDI 

I | 

1645 
IRNVRGWLGM 
MRHVRGWLGM 
MRNVRGPfLGF 
MRNVRAWLGF 
VKRVRAWVGF 
VKRVRAWVGF 
IRRVRAWVGF
IRNVRGWVGF 
IRHVRAWIGF 



1595 
QFKTTGDLAV 
RFKTSGDLAV 
NFKTDQYLAV 
NFKTSDGLAV 
KYKATGDLAV 
KYKATGDLAV 
KYKVGGDLAV 
TYKVNDELAA 
KFKTEG-LCV 

....| | 

1655 
DVESAHVCGD 
DVEGAHVTGD 
DVEGAHVVGS 
DVEGAHVCGD 
DAEGAHATRD 
DAEGAHATRD 
DAEGAHATRD 
DVEATHACGT 
DVEGCHATRD 



1605 
QIGSNN — VC 
QIGNNN — VC 
QIGVNG— PI 
NIG-TK — DV 
CLGIGD-SAV 
CLGIGD-SAV 
CLNVAD-SAV 
LVNVEAGSEI 
DIPGIP-KDM 

I | 

1665 
NIGTNVPLQV 
NVGTNVPLQV 
NVGTNVPLQL 
NVGTNVPLQL 
SIGTNFPLQL 
SIGTNFPLQL 
SIGTNFPLQL 
NIGTNLPFQV 
AVGTNLPLQL 



1615 
TYEHVISFMG 
TYEHVISYMG 
KYEHVISFMG 
KYANVISYMG 
TYSRLISLMG 
TYSRLISLMG 
TYSRLISLMG 
TYKHLISLLG 
TYRRLISMMG 

• ••• ).♦,,{ 

1675 
GFSNGVNFW 
GFSNGVDFVA 
GFSNGVDFVV
GFSNGVDFVV 
GFSTGIDFW 
GFSTGIDFVV
GFSTGIDFVV 
GFSTGADFVV 
GFSTGVNLVA 



1705 
SPPGEQFRHL 
APPGEQFTHI 
APPGEQFAHL 
APPGEQFAHL
APPGEQFKHL 
APPGEQFKHL 
APPGEQFKHL 
APPGEQFNHL 
PPPGDQFKHL 

I | 

1765 
KIGPIKYCY- 
KIGAVKHCQ- 
KIGPSKSCD- 
KIGRPQKCE- 
KVGREISCNV 



1715 
VPFLRKGQPW 
VPLLRKGQPW 
LPLLBCRGQPW 
IPLMRKGQPW 
IPLMTRGQRW 
IPLMTRGHRW 
VPLMSRGQKW 
RVLFKSAKPW 
IPLMYKGLPW 

I ...,| 

1775 
CGNSATCYNS 
CGTVATCYNS 
CGKVATCYNS 
CGKSATCYSS 
STKRATAYNS 



1725 
LIVRRRIVQM 
SVLRKRIVQM 
DVVRKRIVQM 
HIVRRRIVQM 
DVVRPRIVQM 
DVVRPRIVQM 
DVVRIRIVQM 
HVIRPRIVQM 
NWRIKIVQM 

• •••}...• | 

1785 
VSNEYCCFKH 
VSNDYCCFKH 
ALHTYCCFKH 
SQSVYACFKH 
RTGYYGCWRH 



I 



1735 
ISDYLSNLSD 
IADFLAGSSD
CSDYLANLSD
VCDYFDGLSD
FADHLIDLSD 
FADHLIDLSD 
LSDHLVDLAD 
LADNLCNVSD 
LSDTLKGLSD 

| | 

1795 
ALGCDYVYNP 
ALGCDYVYNP 
ALGCDYLYNP 
ALGCDYLYNP 
SVTCDYLYNP 






OC43 

MHV 

AIPV 

SARS CoV



EMCR 

229E 

PEDV 

TGEV 

BoCoV 

OC43 

MHV 

AIPV 

SARS CoV



EMCR 

229E 

PEDV 

TGEV 

BoCoV 

OC43 

MHV 

AIPV 

SARS CoV



EMCR 

229E 

PEDV 

TGEV 

BoCoV 

OC43 

MHV 

AIPV 

SARS CoV 



EMCR 

229E

PEDV 

TGEV 

BoCoV 

OC43 

MHV 

AIPV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

BoCoV

OC43

MHV 

AIPV 

SARS CoV 






EMCR 

229E 

PEDV 

TGEV 

BoCoV 

OC43 

MHV 

AIPV 

SARS CoV 



EMCR 



CWLVTWAAN FELTCLRYFA KVGREISCNV CTKRATVYNS RTGYYGCWRH SVTCDYLYNP 
SVVLVTWAAS FELTCLRYFA KVGKEVVCSV CNKRATCFNS RTGYYGCWRH SYSCDYLYNP 
CVVFVTWCHG LELTTLRYFV KIGKEQVCS- CGSRATTFNS HTQAYACWKH CLGFDFVYNP 
RVVFVLWAHG FELTSMKYFV KIGPERTCCL CDKRATCFST SSDTYACWNH SVGFDYVYNP 



1805 
YAFDIQQWGY 
YVIDIQQWGY 
YCIDIQQWGY 
YCIDIQQWGY 
LIVDIQQWGY 
LIVDIQQWGY 
LIVDIQQWGY 
LLVDIQQWGY 
FMIDVQQWGF 

I 



,...|. ...I 

1815 
VGSLSQNHHT 
VGSLSTNHHA 
KGSLSLNHHE 
TGSLSMNHHE 
IGSLSSNHDL 
IGSLSSNHDL 
TGSLTSNHDL 
SGNLQFNHDL 
TGNLQSNHDQ 



. ... I .... I 

1825 
FCNIHRNEHD 
ICNVHRNEHV 
HCNVHRNEHV 
VCNIHRNEHV 
YCSVHKGAHV 
YCSVHKGAHV 
ICSVHKGAHV 
HCNVHGHAHV 
HCQVHGNAHV 



1835 
ASGDAVMTRC 
ASGDAIMTRC 
ASGDAIMTRC 
ASGDAIMTRC 
ASSDAIMTRC 
ASSDAIMTRC 
ASSDAIMTRC 
ASVDAIMTRC 
ASCDAIMTRC 



I 



1845 
LAVHDCFVKN 
LAVYDCFVKN 
LAIHDCFVKN 
LAIHDCFVKR 
LAVYDCFCNN 
LAVYDCFCNN 
LAVHDCFCKS 
LAINNAFCQD 
LAVHECFVKR 



I | 

1855 
VDWTVTYPFI 
VDWSITYPMI 
VDWSITYPFI 
VDWSIVYPFI 
INWNVEYPII 
INWNVEYPII 
VNWSLEYPII 
VNWDLTYPHI 
VDWSVEYPII 



1865 
ANEKFINGCG 
ANENAINKGG 
GNEAVINKSG 
DNEEKINKAG 
SNELSINTSC 
SNELSINTSC 
SNEVSVNTSC 
ANEDEVNSSC 
GDELRVNSAC 



....|....| 

1875 
RNVQGHWRA 
RTVQSHIMRA 
RIVQSHTMRS 
RIVQSHVMKA 
RVLQRVMLKA 
RVLQRVILKA 
RLLQRVMFRA 
RYLQRMYLNA 
RKVQHMWKS 



1925 

VKLLDYD 

VKTLEYD 

VKTLEYD 

VRCLDYD 

VKTLLYF 

VKTLLYS 

VKQFVYK 

VKQFEYD 

AYKIEELFYS 



1 I 

1935 
YATHG — QLD 
YMTHG — QMD 
YITHG — QFD 
YMVHG — QMN 
FEAHKDSFKD 
FEAHKDSFKD 
YEAHKDQFLD 
YNQHKDKFAD 
YATHHDKFTD 



1885 
ALKLYKPSVI 
AIKLYNPKAI 
VLKLYNPKAI 
ALKIFNPAAI 
AMLCNRYTLC 
AMLCNRYTLC 
AMLCNRYDVC 
CVDALKVNW 
ALLADKFPVL

1 



... - I 1 

1895 
HDIGNPKGVR 
HDIGNPKGIR 
YDIGNPKGIR 
HDVGNPKGIR 
YDIGNPKAIA 
YDIGNPKAIA 
YDIGNPKGLA 
YDIGNPKGIK 
HDIGNPKAIK 



1945 
GLCLFWNCNV 
GLCLFWNCNV 
GLCLFWNCNV 
GLMLFWNCNV 
GLCMFWNCNV 
GLCMFWNCNV 
GLCMFWNCNV 
GLCMFWNCNV 
GVCLFWNCNV 



.... 1 I 

1955 
DMYPEFSIVC 
DMYPEFSIVC 
DMYPEFSVVC 
DMYPEFSIVC 
DKYPPNAWC 
DKYPPNAVVC 
DKYPANAVVC 
DCYPDNSLVC 
DRYPANAIVC 



• . . . 1 . 1 

1905 
CA-VTDAKWY 
CA-VTDAKWY 
CA-VTDAKWF 
CA-TTPIPWF 
CV — KDFDFK 
CV — KDFDFK 
CV — KGYDFK 
CVRRGDVNFR 
CVPQAEVEWK 

I 



....|. ...| 

1915 
CYDKQPVNSN 
CYDKNPINSN 
CFDKNPTNSN 
CYDRDPINNN 
FYDAQPIVKS 
FYDAQPIVKS 
FYDASPVVKS 
FYDKNPIVRN 
FYDAQPCSDK 



1965 
RFDTRTRSVF 
RFDTRTRSTL 
RFDTRCRSPL
RFDTRTRSKL 
RFDTRVLNNL 
RFDTRVLNNL 
RFDTRVLNKL 
RYDTRNLSVF 
RFDTRVLSNL 



....I | 

1975 
NLEGVNGGSL 
NLEGVNGGSL 
NLEGCNGGSL 
SLEGCNGGAL 
NLPGCNGGSL 
NLPGCNGGSL 
NLPGCNGGSL 
NLPGCNGGSL 
NLPGCDGGSL 



1985 
YVNKHAFHTP 
YVNNHAFHTP 
YVNNHAFHTP 
YVNNHAFHTP 
YVNKHAFHTK 
YVNKHAFHTK 
YVNKHAFHTS 
YVNKHAFYTP 
YVNKHAFHTP 



....|.... | 

1995 
AYDKRAFVKL 
AYDKRAMAKL 
AFDKRAFAKL 
AYDRRAFAKL 
PFSRAAFEHL 
PFARAAFEHL 
PFTRAAFENL 
KFDRISFRNL 
AFDKSAFTNL 



1 .... I 

2005 
KPMPFFYFDD
KPAPFFYYDD 
KPMPFFFYDD 
KPMPFFYYDD 
KPMPFFYYSD 
KPMPFFYYSD 
KPMPFFYYSD 
KAMPFFFYDS 
KQLPFFYYSD 



I .... I 

2045 
GGAVCSKHAN 
GGAVCSKHAN 
GGAVCSKHCA
GGAVCKKHAA 
GGAVCLKHAE 
GGAVCLKHAE 
GGAVCLKHAE 
GGAVCKKHAQ 
GGAVCRHHAN 



.... I I 

2055 
LYQKYVEAYN 
LYRAYVESYN 
MYHSYVNAYN 
LYRAYVEDYN 
EYREYLESYN 
EYREYLESYN 
DYREYLESYN 
MYAEFVTSYN 
EYRQYLDAYN 



....|....| 

2065 
TFTQAGFNIW 
IFTQAGFNIW 
TFTSAGFTIW 
IFMQAGFTIW 
TATTAGFTFW 
TATTAGFTFW 
TATTAGFTFW 
AAVTAGFTFW 
MMISAGFSLW 



I I 

2015 

SDCDWQ 

GSCEVVH 

TECDKLQ 

SNCELVD 

TPCVYMDGMD 
TPCVYMDGMD 
TPCVYMEGME 
SPCETIQ-VD 
SPCESHGKQV 

I 



I I 

2025 
-EQVNYVPLR 
-DQVNYVPLR 
-DSINYVPLR 
-GQPNYVPLK 
AKQVDYVPLK 
AKQVDYVPLK 
SKQVDYVPLR 
GVAQDLVSLA
VSDIDYVPLK 



1 | 

2035 
ASSCVTRCNI 
ATNCITKCNI 
ASNCITKCNV 
SNVCITKCNI 
SATCITRCNL 
SATCITRCNL 
SATCITRCNL 
TKDCITKCNI
SATCITRCNL 



I 



2075 
VPHSFDVYNL 
VPTTFDCYNL 
VPTSFDTYNL 
CPQNFDTYML 
VYKTFDFYNL 
VYKTFDFYNL 
VYKTFDFYNL 
VTNKLNPYNL 
IYKQFDTYNL 



I.. 

2085 
WQIFIET-NL 
WQTFTEV-NL 
WQTFSN — NL 
WHGFVNSKAL 

WNTFTK L 

WNTFTK L 

WNTFTR L 

WKSFSA L 

WNTFTR L 



....|....| 

2095 
QSLENIAFNV 
QGLENIAFNV 
QGLENIAFNV 
QSLENVAFNV 
QSLENVVYNL 
QSLENVVYNL 
QSLENVVYNL 
QSIDNIAYNM 
QSLENVAYNV 



2105 
VKKGCFTGVD 
VNKGSFVGAD 
LKKGSFVGDE 
VKKGAFTGLK 
VKTGHYTGQA 
VKTGHYTGQA 
VNAGHFDGRA 
YKGGHYDAIA 
VNKGHFDGHA 



....|....| 

2115 
GELPVAWND 
GELPVAISGD 
GELPVAWND 
GDLPTAVIAD 
GEMPCAIIND 
GEMPCAIIND
GELPCAVIGE 
GEMPTVITGD 
GEAPVSIINN 



I .... I 

2125 
KVFVRYGDVD 
KVFVRDGNTD 
KVLVRDGTVD 
KIMVRDGPTD 
KVVAKIDKED 
KVVAKIDKED 
KVIAKIQNED 
KVFVIDQGVE 
AVYTKVDGID 



I | 

2135 
NLVFTNKTTL 
NLVFVNKTSL 
TLVFTNKTSL 
KCIFTNKTSL 
VVIFINNTTY 
WIFINNTTY 
VVVFKNNTPF 
KAVFVNQTTL 
VEIFENKTTL 



I I 

2145 
PTNVAFELFA 
PTNIAFELFA 
PTNVAFELYA 
PTNVAFELYA 
PTNVAVELFA 
PTNVAVELFA 
PTNVAVELFA 
PTSVAFELYA 
PVNVAFELWA 



....1 I 

2155 
KRKMGLTPPL 
KRKVGLTPPL 
KRKVGLTPPI 
KRKLGLTPPL 
KRSIRHHPEL 
KRSVRHHPEL 
KRSIRPHPEL 
KRNIRTLPNN 
KRNIKPVPEI 



. ... I .... I ....|....| 
2165 2175 2185 2195 

SILKNLGVVA TYKFVLWDYE AERPFTSYTK SVCKYTDFN- 



..|.... I ....|....| 
2205 2215 
EDV CVCFDNSIQG 



229E

PEDV 

TGEV 

BoCoV 

OC43 

MHV 

AIPV 

SARS CoV 



EMCR 

229E

PEDV 

TGEV 

BoCoV 

OC43 

MHV 

AIPV 

SARS CoV 



EMCR 

229E

PEDV 

TGEV 

BoCoV 

OC43 

MHV 

AIPV 

SARS CoV 



EMCR 
229E 
PEDV 
TGEV 
BoCoV 
OC43
MHV
AIPV
SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

BoCoV 

OC43 

MHV 

AIPV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

BoCoV 

OC43 

MHV 

AIPV 

SARS CoV



EMCR 

229E 

PEDV 

TGEV 

BoCoV

OC43 

MHV 

AIPV 

SARS CoV 



SILKNLGVVA 
TILRNLGWC 
TILRNLGWA 
KLFRNLNIDV 
KLFRNLNIDV 
KLFRNLNIDV 
RILKGLGVDV 
KILNNLGVDI 



TYKFVLWDYE 
TSKCVIWDYE 
TYKFVLWDYE 
CWKHVIWDYA 
CWKHVIWDYA 
CWSHVLWDYA 
TNGFVIWDYA 
AANTVIWDYK 



....I | 

2225 
SYERFTLTTN 
SYERFTLSTN 
SLERFSMTQN 
SFERFTTTRD 
ALEAFKRSNN 
ALEAFKRSNN 
ALEAFKKCRD 
DYQSFLAADN 
QVDLFRNARN 





2235 
AVLFSTVVIK 
AVLFSATAVK 
AVLMSLTAVK 
AVLISNNAVK 
GVYISTTKVK 
GVYISTTKVK 
GVYINTTKIK 
AVLVSTQCYK 
GVLITEGSVK 



AERPLTSFTK 
AERPLTTFTK 
AERPFSNFTK 
RESIFCSNTY 
RESIFCSNTY 
KDSVFCSSTY 
NQTPLYRNTV 
REAPAHVSTI 

I .... 1 



SVCGYTDFA- 
DVCKYTDFE- 
QVCSYTDLD- 
GVCMYTDLK- 
GVCMYTDLK- 
KVCKYTDLQ- 
KVCAYTDIE- 
GVCTMTDIAK 



EDV CTCYDNSIQG 

GDV CTLFDNSIVG 

SEV VTCFDNSIAG 

LIDKL NVLFDGRDNG 

FIDKL NVLFDGRDNG 

CIESL NVLFDGRDNG 

PNGL VVLYDDR-YG 

KPTESACSSL TVLFDGRVEG 



2245 

N LTPIK 

TGGKSLPAIK 

K LTGIK 

G LSAIK 

S LSMIR 

S LSMIR 

S LSMIK 

R YSYVE 

G LTPSK 



] 1 

2255 
LNFGMLNGMP 
LNFGMLNGNA 
LTYGYLNGVP 
LQYGLLNDLP 
GPPRAELNGV 
GPPRAELNGV 
GPQRADLNGV 
IPSNLLVQNG 
GPAQASVNGV 



I I 

2265 
VSSIKSDKGV 
IATVKSEDGN 

VNTHED 

VSTVGN 

WDKVGD 

VVDKVGD 

WEKVGD 

MPLKDG 

TLIGES 



t r 

2275 
EKLVNWYTYV 
IKNINWFVYV 
-KPFTWYIYT 
-KPVTWYIYV 
-TDCVFYFAV 
-TDCVFYFAV 
-SDVEFWFAM 

ANLYVYK 

-VKTQFNYFK 



....)....| . ... ) .... | 
2285 2295 

RKNG QFQDH 

RKDG KPVDH 

RKNG KFEDY 

RKNG EYVEQ 

RKEGQDVIFS QFDSLRVSSN 
RKEGQDVIFS QFDSLGVSSN 
RRDGDDVIFS RTGSLEPSHY 

RVNG AFVTL 

KVDG . — IIQQL 



I. 



2305 



I 



2315 



\ 



..|....| 
2325 
-DGFYTQ 
-DGFYTQ 
-DGYFTQ 
-DSYYTQ 



y 

y 

p 

I 

QSPQGNLGSN -EPGNVGGND ALATSTIFTQ 
QSPQGNLGSN GKPGNVGGND ALSISTIFTQ 
RSPQGNPGGN -RVGDLSGNE ALARGTIFTQ 

p NTINTQ 

p ETYFTQ 



| | 

2335 
GRNLSDFTPR 
GRNLQDFLPR 
GRTTADFSPR 
GRTFETFKPR 
SRVISSFTCR 
SRVISSFTCR 
SRFLSSFAPR 
GRSYETFEPR 
SRDLEDFKPR 



I | 

2345 
SDMEYDFLNM 
STMEEDFLNM 
SDMEKDFLSM 
STMEEDFLSM 
TDMEKDFIAL 
TDMEKDFIAL 
SEMEKDFMDL 
SDIERDFLAM 
SQMETDFLEL 



1 | 

2355 
DMGVFINKYG 
DIGVFIQKYG 
DMGLFINKYG 
DTTLFIQKYG 
DQDVFIQKYG 
DQDVFIQKYG 
DEDVFIAKYS 
SEESFVERYG 
AMDEFIQRYK 



\ 



2365 
LEDFNFEHW 
LEDFNFEHW 
LEDYGFEHW 
LEDYGFEHW 
LEDYAFEHIV 
LEDYAFEHIV 
LQDYAFEHW 
-KDLGLQHIL 
LEGYAFEHIV 



I .... I 

2375 
YGDVSKTTLG 
YGDVSKTTLG 
YGDVSKTTLG 
FGDVSKTTIG 
YGNFNQKHG 
YGNFNQKIIG 
YGSFNQKIIG 
YGEVDKPQLG 
YGDFSHGQLG 



I 



2385 
GLHLLISQFR 
GLHLLISQVR 
GLHLLISQVR 
GMHLLISQVR 
GLHLLIGLYR 
GLHLLIGLYR 
GLHLLIGLAR 
GLHTVIGMYR 
GLHLMIGLAK 



| | 

2395 
LSKMGVLKAD 
LSKMGILKAE 
LACMGVLKID 
LAKMGLFSVQ 
RQQTSNLVIQ 
RQQTSNLWQ 
RQQKSNLVIQ 
LLRANKLNAK 
RSQDSPLKLE 



I I 

2405 
DFVTASDTTL 
EFVAASDITL 
EFVSSNDSTL 
EFMNNSDSTL 
EFVS-YDSSI 
EFVS-YDSSI 
EFVP-YDSSI 
SVTN-SDSDV 
DFIP-MDSTV



....!....) 

2415 
RCCTVTYLNE 
KCCTVTYLND 
KSCTVTYADN 
KSCCITYADD 
HSYFITDEKS 
HSYFITDEKS 
HSYFITDENS 
MQNYFVLSDN 
KNYFITDAQT 



....1 ....I 

2425 
LSSKVVCTYM 
PSSKTVCTYM 
PSSKMVCTYM 
PSSKNVCTYM 
GGSKSVCTVI 
GGSKSVCTVI 
GSSKSVCTVI 
GSYKQVCTW 
GSSKCVCSVI 



I .... I 

2435 
DLLLDDFVTI 
DLLLDDFVSV 
DLLLDDFVSI 
DILLDDFVTI 
DILLDDFVAL 
DILLDDFVAL 
DLLLDDFVDI 
DLLLDDFLEL 
DLLLDDFVEI 



I . I 

2445 

LK SLDLG 

LK SLDLT 

LK SLDLS 

IK SLDLN 

VK SLNLN 

VK SLNLN 

VK SLNLN 

LRNILKEYGT 
IK SQDLS 



| | 

2455 
VISKVHEVII 
VVSKVHEVII 
VVSKVHEVMV 
WSKWDVIV 
CVSKVVNVNV 
CVSKWNVNV 
CVSKVVNVNV 
NKSKVVTVSI 
VISKVVKVTI 



I 

2465 
DNKPYRWMLW 
DNKPWRWMLW 
DCKMWRWMLW 
DCKAWRWMLW 
DFKDFQFMLW 
DFKDFQFMLW 
DFKDFQFMLW 
DYHSINFMTW 
DYAEISFMLW 



I .... I 

2475 
CKDNHLSTFY 
CKDNAVATFY 
CKDHKLQTFY 
CENSHIKTFY 
CNDEKVMTFY 
CNDEKVMTFY 
CNEEKVMTFY
EEDGSIKTCY 
CKDGHVETFY 



2485 
PQLQS-AEWK 
PQLQS-AEWK 
PQLQA-SEWK 
PQLQS-AEWN 
PRLQAASDWK 
PRLQAASDWK 
PRLQAAADWK 
PQLQS — AWT 
PKLQASQAWQ 



.... | | 

2525 
PSGIMLNWK 
PSGIMFNWK 
PDGIMFNVVK 
PDGITTNVVK 
PTGCMMNVAK 
PTGCMMNVAK 
PTGCLMNVAK 
PSGILMNVAK 
PKGIMMNVAK 



....!....! 

2535 
YTQLCQYLNS 
YTQLCQYFNS 
YTQLCQYLNS 
YTQLCQYLNT 
YTQLCQYLNT 
YTQLCQYLNT 
YTQLCQYLNT 
YTQLCQYLSK 
YTQLCQYLNT 



....)....! 

2545 
TTMCVPHNMR 
TTLCVPHNMR 
TTMCVPHHMR 
TTLCVPHKMR 
TTLAVPVNTR 
TTLAVPVNMR 
TTLAVPANMR 
TTICVPHNMR 
LTLAVPYNMR 



I I 

2495 
CGYAMPQIYK 
CGYSMPGIYK 
CGYSMPSIYK 
PGYSMPTLYK 
PGYSMPVLYK 
PGYSMPVLYK 
PGYVMPVLYK 
CGYNMPELYK 
PGVAMPNLYK 

J 



2555 
VLHYGAGSDK 
VLHLGAGSDY 
VLHLGAGSDK 
VLHLGAAGAS 
VLHLGAGSEK 
VLHLGAGSEK 
VLHLGAGSDK 
VMHFGAGSDK 
VIHFGAGSDK 



| | 

2505 
LQRMCLEPCN 
TQRMCLEPCN 
IQRMCLEPCN 
IQRMCLERCN 
YLNSPMERVS
YLNSPMERVS 
YLESPLERVN 
VQNCVMEPCN 
MQRMLLEKCD 



I | 

2515 
LYNYGAGIKL 
LYNYGAGLKL 
LYNYGAGVKL 
LYNYGAQVKL 
LWNYGKPVTL 
LWNYGKPVTL 
LWNYGWI.TL. 
IPNYGVGITL 
LQNYGENAVI 



I .... 1 

2565 
GVAPGTTVLK 
GVAPGTAVLK 
GVAPGTAVLR 
GVAPGSTVLR 
GVAPGSAVLR 
GVAPGSAVLR 
DVAPGSAVLR 
GVAPGSTVLK 
GVAPGTAVLR 



....| I 

2575 

RWLPPD 

RWLPHD 

RWLPLD 

RWLPDD 

QWLPAGTILR 

QWLPAG 

QWLPAG 

QWLPEG 

QWLPTG 



EMCR
229E
PEDV
TGEV
BoCoV
OC43
MHV
AIPV
SARS CoV



EMCR 
229E 
PEDV 
TGEV 
BoCoV
OC43 
MHV 
AIPV 

SARS CoV 




EMCR 

229E 
PEDV

TGEV 

BoCoV 

OC43 

MHV 
AIPV

SARS CoV 



EMCR

229E 

PEDV 

TGEV 

BoCoV 
OC43

MHV 

AIPV 

SARS CoV 



2585 

AIII 

AIW 

AIIV 

AILV 

QWLPAGTILV 

TILV 

SILV 

TLLV 

TLLV 

I I 

2645 
NVSKDGFFTY 
NVSKEGFFTY 
NVSKEGFFPY 
NTSKDGFFTY 

VVRCS YI 

NVSKDGFFTY 
NVSKDGFFTY 
NNGNDDVFIY 
NDSKEGFFTY 



| I 

2595 
DNDINDYVSD 
DNDVVDYVSD 
DNDSVDYVSD 
DNDLRDYVSD 
HNDLYPFVSD 
DNDLYPFVSD 
DNDINPFVSD 
DNDIVDYVSD 
DSDLNDFVSD 



....|....| 

2605 
ADFSITGDCA 
ADFSVTGDCA 
ADYSVTGDCS 
ADFSVTGDCT 
SVATYFGDCI 
SVATYFGDCI 
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EFYAWIDAI EEKLSPCKEL EGVGAKVSAF LQKLEDNPLF 
EFYAWIDAI EEKLSPCKEL EGVGAKVSAF LQKLEDNSLF 
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NVALVDKNGK DLDCIKSCHL IYR 

EFACVVAEAV VKTLQPVSDL LTN MGID LDEWS VAT FY 
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KCVDIVKEAQ SANPMVIVNA 
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ANIHLKHGGG 
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YIYDEEGGYD 
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YIYDEEGGTD 
FIYDTCGGFD 
YSQLFVDTLV 
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KSVPKSIILP 
KTMTEQVWE 
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VAYWKCIKCD 
YANWRCLKCD 
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GAEGTSSQEE 
LDAMFFYGDV 
LDAMFFYGDV 
LDAMFFYGDV 



VANSEPGDDG 
VETVEVADIT 
VSHICKCGES 
VSHVCKCGES 
VSHVCKCGTG 



LQSLQVCVQT VRTQVYIAVN 
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HE 
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DKALYEQVVM 
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IETVDVKHDV 
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DYLDNLKPRV 
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S 

SFIKDTPSTV 
AVDVQEAEQF 
FCAFITKRIV 
FCAFITKRSV 
FCAFYTPRKV 

S 

EAPKQEEPPN 
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VD 
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VDSSAPEKVA 
PENEIVEASE 

LALK LKG 
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VD 

AGIFG — AKP 
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YKAACWDVN 
FRAACVVDVN 
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TEDSKTEEKS 
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EMCR SLGDSGKLLS ELLKDKYTCS ITFEMSCDCG KKFDEQVGCL FWIMPYTKLF QKGECCICHK 

229E AMGDVGLCMY RLLKDLHTGF MVMDYKCSCT SGRLEESGAV LFCTPTKKAF PYGTCLNCNA 

PEDV SLGDVSACLE SLTKDLHTLK ITCSVVCGCG TGERIYEGCA FRMTPTLEPF PYGACAQCAQ 

TGEV HSGDAEYLLE LMLNDYSTAK IVLAAKCGCG EKEIVLERAV FKLTPLKESF NYGVCGDCMQ 

OV43 KGDIIKVSKL VKAEVWNPA NGHMAHGGGV AKAIAVAAGQ QFVKETTDMV KSKGVCATGD 

BOCOV KGDIIKVSKR VKAEVWNPA NGHMAHGGGV AKAIAVAAGQ QFVKETTDMV KSKGVCATGD 

MHV KGDVIKVLRR VGAEVIVNPA NGRMAHGAGV AGAIAKAAGK SFIKETADMV KNQGVCQVGE 

AIBV EFKEFCIVNA ANEHMTHGSG VAKAIADFCG LDFVEYCEDY VKKHGPQQRL VTPSFVKGIQ 

SARS COV EKDAPYMVGD VITSGDITCV VIPSKKAGGT TEMLSRALKK VPVDEYITTY PGQGCAGYTL 
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1385 1395 1405 1415 1425 1435 

EMCR MQTYKLVSMK GTGVFVQD-- PAPIDIDAFP VRPICSSVYL GVKGSGHYQT NLYSFDKAID

229E PRMCTIRQLQ GTIIFVQQK- PEPVNPVSFV VKPVCSSIFR GAVSCGHYQT NIYSQNLCVD 

PEDV VLMHTFKSIV GTGIFCRD-- TTALSLDSLV VKPLCAAAFI GK-DSGHYVT NFYDAAMAID

TGEV VNTCRFLSVE GSGVFVHDIL SKQTPEAMFV VKPVMHAVYT GTTQNGHYMV DDIEHGYCVD 

OV43 CYVSTGGKLC KTVLNWGPD ARTQGKQSYV LLERVYKHLN NYDCWTTLI SAGIFSVPSD

BoCoV CYVSTGGKLC KTVLNWGPD ARTQGKQSYA LLERVYKHLN KYDCVVTTLI SAGIFSVPSD 

MHV CYESTGGNLC KTVLNIVGPD ARGHGKQCYS FLERAYQHIN KCDDWTTLI SAGIFSVPTD 

AIBV CVNNWGPRH GDNNLHEKLV AAYKNVLVDG WNYVVPVLS LGIFGVDFKM SIDAMREAFE

SARS CoV EEAKTALKKC KSAFYVLPSE APNAKEEILG TVSWNLREML AHAEETRKLM PICMDVRAIM 

. ... | .... | I .... I ....I.... I 

1445 1455 1465 1475 1485 1495 

EMCR GFGVFDIK NSSV NTVCFVDVDF HS-VEIEAGE VK

229E GFGVNKIQP WTNDAL NTICIKDADY NAKVEISVTP IKNTVDTTPK 

PEDV GYGRHQIK YDTL NTICVKDVNW TAPLVPAVDS VVEP 

TGEV GMGIKPLKKR CYTSTLFINA NVMTRAEKPK QEFKVEKVEQ QPIVEENKSS IEKEEIQSPK 

OV43 VSLTYLLGTA KKQWLVSNN QEDFDLISKC QITAVEG-TK KLAARLSFNV GRSIVYETDA 

BoCoV VSLTYLLGTA KKQWLVSNN QEDFDLISKC QITAVEG-TK KLAERLSFNV GRSIVYETDA

MHV VSLTYLIGW TKNVILVSNN KDDFDVIEKC QVTSIAG-TK ALSLQLAKNL CRDVKFETNA 

AIBV GCTIRVLLFS LSQE HIDYFDVTCK QKTIYLTEDG VKYR 

SARS CoV ATIQRKYKGI KIQEGIVDYG VRFFFYTSKE PVASIITKLN SLNEPLVTMP IGYVTHGFNL 

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

1505 1515 1525 1535 1545 1555 

EMCR PFAVYKNVKF YLGDISHLVN CVS FDFWNA ANENLMHGGG VARAIDILTE 

229E EEFVVKEKLN AFLVHDNVAF YQGDVDTVVN GVDFDFIVNA ANENLAHGGG LAKALDVYTK

PEDV VVK PFYSYKNVDF YQGDFSDLVK -LPCDFVVNA ANEKLSHGGG IAKAIDVYTK

TGEV ND DLIL PFYKAGKLSF YQGALDVLIN FLEPDVIVNA ANGDLKHMGG VARAIDVFTG

OV43 NKLILIN DVAFVSTFNV LQDVLSLRHD IALDDDARTF VQSNVDVVPE GWRVVNKFYQ 

BoCoV NKLILSN DVAFVSTFNV LQDVLSLRHD IALDDDARTF VQSNVDVVPE GWRVVNKFYQ 

MHV CDSLFS DSCFVSSYDV LQEVELLRHD IQLDDDARVF VQAHMDNLPA DWRLVNKFDS

AIBV SIVLKPG DSLGQFGQVY AKNKIVFTAD DVEDKE I L Y V 

SARS CoV EEAARCMR-- SLKAPAVVSV SSPDAVTTYN GYLTSSSKTS EEHFVETVSL AGSYRDWSYS

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

1565 1575 1585 1595 1605 1615 

EMCR GQLQSLSKDY ISSNGPLKVG AGVMLE — CE KFN — VFNW GPRTG KHEHSLLVEA 

229E GKLQRLSKEH IGLAGKVKVG TGVMVE--CD SLR--IFNW GPRKG KHERDLLIKA

PEDV GMLQKCSNDY IKAHGPIKVG RGVMLE — AL GLK — VFNW GPRKG KHAPELLVKA 

TGEV GKLTERSKDY LKKNKSIAPG NAVFFEN V I E HLS — VLNAV GPRNGD SRVEAKLCNV 

OV43 INGVRT-VKY FECTGGIDIC SQDKVFGYVQ QGIFNKATVA QIKALF LDKVDILLTV 

BoCoV INGVRP-VKY FECPGGIDIC SQDKVFGYVQ QGSFNKATVA QIKALF LDKVDILLTV 

MHV VDGVRT-VKY FECPGEIFVS SQGKKFGYVQ NGSFKVASVS QIRALL ANKVDVLCTV

AIBV PTTDKSILEY YGLDAQKYVI YLQTLAQKWN VQYRDNFLIL EWRDGN — CW ISSAIVLLQA 

SARS COV GQRTELGVEF LKRGDKIVYH TLESPVEFHL DG — EVLSLD KLKSLLSLRE VKTIKVFTTV 

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

1625 1635 1645 1655 1665 1675

EMCR YNSILFENGI PLMPLLSCGI FGVRIENSLK ALFSCDINKP LQVFVYSSNE EQAVLKFLDG 

229E YNTINNEQGT PLTPILSCGI FGIKLETSLE VLLDVCNTKE VKVFVYTDTE VCKVKDFVSG 

PEDV YKSVFANSGV ALTPLISVGI FSVPLEESLS AFLACVGDRH CKCFCYGDKE REAIIKYMDG 

TGEV YKAIAKCEGK ILTPLISVGI FNVRLETSLQ CLLKTVNDRG LNVFVYTDQE RQTIENFFS- 

OV43 DGVNFTNRFV PVGESFGKSL GNVFCDGVNV TKHKCDINYK GKVFFQFDNL SSEDLKAVRS

BoCoV DGVNFTNRFV PVGESFGKSL GNVFCDGVNV TKHKCDINYK GKVFFQFDNL SSEDLKAVRS 

MHV DGVNFRSCCV AEGEVFGKTL GSVFCDGINV TKVRCSAIHK GKVFFQYSGL SAADLVAVTD 

AIBV AKIRFKGFLT EAWAKLLGGD PTDFVAWCYA SCTAKVGDFS DANWLLANLA EHFDADYTNA 

SARS CoV DNTNLHTQLV DMSMTYGQQF GPTYLDGADV TKIKPHVNHE GKTFFVLPSD DTLRSEAFEY 

....|....| ....|....| ....|....| ....|....| .,|....| 

1685 1695 1705 1715 1725 1735 

EMCR LDLTPVID DVDV V KPFRVEGN FSFFDCG VNALDGD-IY 

229E LVNVQKVE QPKI EPKPVSVIKV APKPYRVDGK FSYFTED LLCVADDKPI 

PEDV LVDAIFKEAL VDTTPVQEDV QQVSQKPVLP NFEPFRIEGA HAFYECNPEG LMSLGAD-KL

TGEV 

OV43 SFNFDQKELL AYYNMLVN — CFKWQVWNG KYFTFKQANN NCFVNVSCLM LQSLHLTFKI 

BOCOV SFNFDQKELL AYYNMLVN — CSKWQVVFNG KYFTFKQANN NCFVNVSCLM LQSLNLKFKI 

MHV AFGFDBPQLL KYYNMLG MCKWPWVCG NYFAFKQSNN NCYINVACLM LQHLSLKFHK 

AIBV FLKKRVSCN
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SARS CoV 
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pjsav. 

TGEV 
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BoCoV 

MHV 

AIBV 

SARS CoV 



EMCR 
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PEDV 
TGEV 



YHTLDESFLG RYMSALNH— TKKWKFPQVG 6LTSIKWADN MCH.SSVLI.ft LQQLEVKFNA 

1745 17 55 1765 1775 1785* ""i'-Ur 

LLFTNSILML DKQGQL LDTKLNGILQ QAVLDYLATV KTVPAGNLVK LWE-qrTTV 

VLFTDSMLTL DDRGLA LDNALSGVLS AAIKDCVDIN KAIPSGNMK EL L 

VLFTNSNLDF CSVGKC-™ ™SGALL EAINVFKKSN B LDCANMISIT 

XS!!SS AWLEF RSGRPARFVA LVLAKGGFKF GDPADSRDFL RVVFSQVDLT GArCDFF^Ar 
2«f RSGRPARFVS LVLAKGGFKF GDPADSRDFL RVVFSQVDLT GMCDTOIK 
WQWQEAWNEF RSGKPLRFVS LVLAKGSFKF NEPSDSTDFM RWLREADLS GATCDFEFVC 

PALQEAYYRA RAGDAANFCA LILAYSNKTV GELGDVRE™ THLLQHANLE SAKRVLNWC 



1805 
M-CWPSIND 
M-CWPSEKD 
M-WLPFDGD 

— CSIP 

K-CGVKQEQR 
K-CGVKQEQR 
K-CGVKQEQR 
— CGIKSYEL 
KHCGQKTTTL 

• »t 

1865 
MDVNDCFKND 

AEA K 

VER — FYANK 

— VN 

PASVKLPKG-
PASVKLPKG- 
PEGKKLPDD- 

LLHFK 

PAEYKLQQGT 
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1815 1825 1835 

LSFDKNLGRC VRKLNRLKTC VIANVPAIDV 
KHLDNNVQRC TRKLNRLMCD IVCTIPADYI 
ANYDKNYARA WKVSKLKGK LVLAVDDATL 



• • • • I I . | | 

1845 1855 
LKKLLSSLTL TVKFVVESNV 
LPLVLSSLTC NVSFVGELKA 
YSKLS — HLS VLGFVSTPDD 



TGLDAVMHFG TLSREDLEIG YTVDCSCG — 
TGVDAVMHFG TLSREDLEIG YTVDCSCG— 
KGVDAVMHFG TLDKGDLAKG YTIACTCG — 
RGLEACIQP 

TGVEAVMYMG TLSYDNLKTG VSIPCVCGR- 



-KKLIHCVRF DVPFLICSNT 
-KKLIHCVRF DVPFLICSNT 
-NKLVHCTQL NVPFLICSNK 

V RATN 

-DATQYLVQQ ESSFVMMSAP 
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1875 
NWLKITEDG 
VITIKVTEDG 
SVVIKVTEDT 

VTEDN 

VGSANIFIG- 
VGSANIFKG- 
VVAANIFTG- 
TQYSNCPTCG 
FLCANEYTGN 



•■»». }»...} 

1925 
TVLSVAPEVD 
ELLTKAIDVD 
WAKWPNAN 
QKVIKAIDID 
NLKQTFKSVL 
NLKQTFKSVL 
NLKQTFSSKL 
TVVFVGSTNS 
ETSYTTTIKP 

I | 

1985 
QYLKPTFKSK 
QYSKPHFISQ 
QFARFRFKSA 
QRLKPQWKFP 
KLIG — HTVC 
KLIG — HTVC 
KLVG--HSIA 
GKSKS-VKED 
KLTCSNTKFA 

I | 

2045 
IVTLEQYSTC 
QVQLEHYSSC 
SVTIERVTHD
EIIVTHTTAC 
KPVIWLSHEK 
KPVIWLSHEQ 
KPVIWLGHEE 
SKLPLTLKVR 
KPIVWHINQA 



»...f»*«.| 

1935 
WVAFYGFEKA 
WVEFYGFKDA 
WDSHYGFDKA 
WQAHYGFRDA 
TTYYLDDVKK 
TTYYLDDVKK 
TTFYLDDVKC 
GHCYTQAAGQ 
VSYKLDGVTY 

I I 

1995 
GLNVLWNKFV 
GLDAAWNKFV 
GLQAMWESYC 
GVRGLWNEFL 
DSLNAKLGFD 
DILNAKLGFD 
EKFNAKLGFD 
VSNLATSSKA 
DDLNQMTGFT 



»-.•!•.. 
2055 

DIC 

VECDAKF- 

GCC 

DKC 

ASLNSLT- 
ASLNSLT- 
ASLKSLT- 
GIKS 
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1885 
INVKDVWES 
VNVHDVTVTT 
RSVKAVKVES 
VNHERVSVSF 
DKVGHYVHVK 
DKVGHYVHVK 
GSLGHYTHVK 
ANNTDEVIEA 
YQCGHYTHIT 

• »••(.. .*] 

1945 
ALFASLDVKP 
VTFATVDHSA 
GEFHMLDHTG 
AAFSASSHDA 
IEYKPDLSQY 
IEYKPDLSQY 
VEYNPDLSQY 
AFDNLAKDRK 
TEIEPKLDGY 

I I 

2005 
TGDVGPFVSF 
LGDVEIFVAF 
TGDVAMFVHW 
ERKTQGFVHM 
SSKEFVEYKI 
SSKEFVEYKV 
CNSPFTEYKI 
SFDNLTDFEQ 
KP-ASRELSV 

• •»•)•••. J 

2065 

KSTVV 

KNSVA 

CSKR 

AKVE 

YFNRP 

YFNRP 

YFNRP 

-W 



I | 

1895 
SKSLGKQLG- 
DKSFEQQVG- 
TATYGQQIG- 
DKTYGEQLKG 
CEQSYQLYDA 
CEQSYQLYDA 
CKPKYQLYDA 
SLPYLLLFAT 
AKETLYRIDG 

• <•••(••■ • | 

1955 
YGYPNDFVGG 
FAYESAVVNG 
FTFPSEVVNG 
YKFEVVTHSN 
YCDGGKYYTQ 
YCDGGKYYTQ 
YCESGKYYTK 
FGKKSPYITA 
YKKDNAYYTE 

I | 

2015 
IYFITMSSKG 
VYYVARLMKG 
LYWLTGVDKG 
LYHISGVKKG 
TEWPTATGDV 
TEWPTATGDV
TEWPTATGDV 
WYDSNIYESL 
TFFPDLNGDV 



TTKTTFKPNT WCLRCLWSTK 



1 | 

2075 
EVKSAWCAS 
SINSAIVCAS 
WTAPWNAS 
KFVGPWAAP 
SLVDDNKFDV 
LLVDENKFDV 
SVVCENKFNV 
DFRSKDGFIY 
PVDTSNSFEV 



. ... | .... | 

1905 
WSDGVDSFE 
VIADKDKDLS 
PCLVNDTWT 
TVVIKDKDVT
SNVKKVTDVT 
SNVKKVTDVT 
CNVSKVSEAK 
DGPATVDCDE 
AHLTKMSEYK 

\ | 

1965 
FRVLGTTDNN 
IRVLKTSDNN 
RRVIKTTDNN 
FIVHKQTDNN 
RIIKAQFKTF 
RIIKAQFKTF 
PIIKAQFRTF 
MYTRFAFKNE 
QPIDLVP-TQ 

I | 

2025 
QKGDAEEALS 
DKGDAEDTLT 
QPSDSENALN 
EPGDAELMLH 
VLATDDLYVK 
VLATDDLYVK 
VLASDDLYVS 
KVQESPDNFD 
VAIDYRHYSA 

I | 

* 2085 
VLKDGCDVG- 

VKRDGVQVG- 

VLKLGVEDG- ' — " 

LAIHGTDE 

LKVDDVD 

LKVDDVD 

LPVDVSEPTD KGPVPAAVLV 

KLTPDTD 

LAVEDTQGMD N 



I | 

1915 
GVLP — INTD 
GAVPSDLNTS 
DNKP — WAD 
NQLPSAFDVG 
GKLSDCLYLK 
GNLSDCLYLK 
GNFTDCLYLK 

DAVG 

GPVTDVFY-K 

I | 

1975 
CWVNATCIIL 
CWVNAVCIAL 
CWVNVTCLQL 
CWINAICLAL 
EKVDGVYTNF 
EECVDGVYTNF 
EKVEGVYTNF 
TSLPVAKQSK 
PLPNASFDNF 

• • - - I .... | 

2035 
KLSEYLISDS 
KLSKYLANEA 
MLSKYIVPAG
KLGDLMDNDC
RYERGCITFG 
RYERGCITFG 
RYSGGCVTFG 
KYVSFTTKED 
SFKKGAKLLH 



-.1... 
2095 



I 



• • I • • » • | ....).... | 
2105 2115 

— FCPHRH KLRSRVK 

— YCVHGI KYYSRVR 

LCPHGL NYIGKVV 

— TCVHGV SVNVKVT 



* • I . • ■ 

2125 



I 



2135 



• I I | I .... I 

2145 2155 

FVNGRVVIT NVGEPHSQP 

SVRGRAIIV SVEQLEPCAQ 

— -WKGTTIW NVGKPWAPS 
QIKGTVAIT SLIGPIIG— 



OV43 
BoCoV 
MHV 
AIBV 
SARS CoV






EMCR 

229E 

PEDV 

TGEV 

OV43 

BOCOV 

MHV 

AIBV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OV43 

BOCOV 

MHV 

AIBV 

SARS CoV 



EMCR 
229E 
PEDV 
TGEV 
OV43 
BoCoV 
MHV 
AIBV 
SARS CoV



EMCR

229E 

PEDV 

TGEV 

OV43 
BoCoV

MHV 

AIBV 

SARS CoV 

EMCR 

229E 

PEDV 
TGEV

OV43 

BoCoV 

MHV 

AIBV 
SARS CoV






EMCR 

229E 

PEDV 

TGEV 

OV43 

BoCoV 

MHV 

AIBV 

SARS CoV 



DGGDSS ESGAKE TKEINIIKLS GVKKPFKVED 

DGGDIS ESDAKE PKEINIIKLS GVKKPFKVED 

TGALSGAATA PGTAKEQKVC ASDSVVDQVV SGFLSDLSGA TVDVKEVKLN GVKKPIKVED 

ENSKAPVY YPVLDAISLK 

LACESQ QPTSEEWEN PTIQKEVIE CDVKTTEVVG 



..... 1 .... I 

2165 
SKLLNGIAYT 
SRLLSGVAYT 
HLFLKGVSYT 
-EVLEATGYI 
SVIVNDDTSE 
SVIVNDDTSE 
SWVNDPTSE 
AIWVEGNANF 
NVILKPSDEG 



| | 

2175 
TFS — GSFDN 
AFS — GPVDK 
TFLDNGNGVV 
CYS — GSNRN 
TKYVKSLSIV 
IKYVKSLSIV 
TKWKSLSIV 

VVG HP 

VKVTQELGHE 



\ 



2185 
GHYWYDAAN 
GHYTVYDTAK 
GHYTVFDHGT 
GHYTYYDNRN 
DVYDMWLTGC 
DVYDMWLTGC 
DVYDMFLTGC 
NYYSKSLHIP 
DLMAAYVENT 



. ... I .... I ....|. ...I ....I.. ..I 
2225 2235 2245 

S NVPP IVSEKISVMD 

PV NTVKPKPVIN 

v IKDP VKKAELDATK 

KPQAEERPKN CAFNKVAASP KIVQEQKLLA 

SIP 1 DLL NLREIKPAVN 

SIP 1 DLL NLREIKPVFN 

VIP AKLV LLRDEKQEFV 

KPN* LERI FNIAKKAIVG 

PWS KILA YVKPFLG 



I | 

2195 
NAVYDGARLF 
KSMYDGDRFV 
GMVHDGDAFV 
GLVVDAEKAY 
KYWRTANAL 
RCVVRTANAL 
RYWWMANEL 
TFWENAENFV 
SITIKKPNEL 

I 



I 



....!.. 

2205 
ASDLSTLAVT 
KHDLSLLSVT 
PGDLNVSPVT 
HFNRDLLQVT 
SRAVNVPTIR 
SRAVNVPTIR 
SRLVNSPTVR 
KMGDKIGGVT 
SLALGLKTIA 



I | 

2215 
AIVVVGGCVT 
SVVMVGGYVA 
NVVVSEQTAV 
TAIASNFVVK 
KFIKFGMTLV 
KFIKFGMTLV 
EYVKWGMTKI 
MGLWRAEHLN 
THGIAAINSV 



2255 

KLDTG AQ 

QLDEK AQ 

LLDTMNYASE 
IESGANYALT 
WKAVRNKIS 
WKAVRNKIS 
APKWKAKVI 
SSWTTQCGK 
— QAAITTSN 



I I 

2265 
KFFQFGDFVM 
KFFDFGDFLI 
RFFSFGDFMS 
EFGRYADMFF 
VCFNFIKWLF 
ACFNFIKWLF 
ACYSAVKWFF 
LIGKAATFIA 
CAKRLAQRVF 



| | 

2275 
NNIVLFLTWL 
HNFVIFFTWL 
RNLITVFLYI 
MAGDKILRLL 
VLLFGWIKIS 
VLLFGWIKIS 
LYCFSWIKFN 
DKVGGGWRN 
NNYMPYVFTL 



....[ I 

2285 
LSMFSLLRTS 
LSMFTLCKTA 
LSILGLCFRA 
LEVFKYLLVL
ADNKVIYTTE 
ADNKVIYTTE 
TDNKVIYTTE 
ITDSIKGLCG 
LFQLCTFTKS 



2295 
IMKHDIKVIA 
VTTGDVKIMA 
FRKRDVKVLA 
FMCLRSTKMP 
IASKLTCKLV 
VASKLTCKLV 
VASKLTFNLC
ITRGHFERKM 
TNSRIRASLP 



..... I ....! 

2305 
KAPKRTGVIL 
KAPQRTGWL 
GVPQRTGIIL 
KVKVKP-PLA 
ALAFKNAFLT 
ALAFKNAFLT 
CLAFKNALQT 
SPQFLKTLMF 
TTIAKNSVKS 



2315 
TRSFKYNIRS 
KRSLKYNLKA 
RKSMRYNAKA 
FKDFGAKVRT
FKWSMVARGA 
FKWSWARGA 
FNWNW.SRGF 
FLFYFLKASV 
VAKLCLDAGI 



I 



I I 

2345 
LLYAIYALVF 
LIYTLYSVVL 
GIYALYALLF 
LLIAIYNFFY 
SDFYLPKIGF 
SDFYLPKIGF 
SDFYLPNIGF 
IVWFVYTSNP 
LSICLGSLIC 



I I 

2355 
MIVQFSPFNS 
LCVRFGPFN- 
MTIRFTPIGS 
LFVSIPVVHK 
LPTFVGKIAQ 
LPTFVGKIVQ 
FPTFVGQIVA 
VMFTGIRVLD 
VTAAFGVLLS 



I | 

2365 
LLCGDIVSGY 
-FCSETVNGY 
PVCDDWAGY 
LTCNGAVQAY 
WIKNTFSLVT 
WIKNTFSLVT 
WVKTTFGIFT 
FLFEGSLCGP 
NFGAPSYCNG 



2375 

EKSTFN 

AKSNFV 

ANSSFD 

KNSSFI 

ICDLYSMQDV 
ICDLYSIQDV 
LCDLYQVSDV 
YKDYGK — DS 
VRELYLNSSN 



....[.. 

2325 
ALFVVKQKWC 
SAAVLKSKWW 
LGVFFKLKLY 
LNYMRQLNKP 
CIIATIFLLW 
CIIATIFLLW 
FLVATVFLLW 
KSWASYKTV 
NYVKSPKFSK 

....!....! 



[ | 

2335 
VIVTLFKFLL 
LLAKFTKLLL 
WFKVLGKFSL 
SVWRYAKLVL 
FNFIYANVIF 
FNFIYANVIF
FNFLYANVIL
LCKWLATLL 
LFTIAMWLLL



2385 
-~KDIYCGNS 
— KDDYCDGS 
— KNEYCN-S 
— KSAVCGNS 
GFKNQYCNGS 
GFKNQYCNGS 
GYRSSFCNGS 
FDVLRYCADD 
VTTMDFCEGS 



| | 

2395 
MVCKMCLFSY 
LGCKMCLFGY 
VICKVCLYGY 
ILCKACLASY 
IACQFCLAGF 
IACQFCLAGF
MVCELCFSGF 
FICRVCLHDK 
FPCSICLSGL 



I I 

2405 
QEFNDLDHTS 
QELSQFSHLD 
QELSDFSHTQ 
DELADFQHLQ 
DMLDNYKAID 
DMLDNYKAID 
DMLDNYDAIN 
DSLHLYKHAY 
DSLDSYPALE 



1 I 

2415 

LVWKHIR 

VVWKHIT 

VVWQHLR 

VTWDFKS 

WQYEADRRA 
VVQYEADRRA 
VVQHVVDRRV 
SVEQVYKDAA 
TIQVTIS — S 



I [ 

2425 

P— 

D — P — 

D — P — 

D — P — 

FVDYTGVLKI 
FVDYTGVLKI 
SFDYISLFKL 

SG 

YKLDLTILGL 



I | 

2435 
-ILISLQPFV 
-LFSNMQPFI 
-LIGNVMPFF 
-LWNRLVQLS 
VIELIVSYAL 
VIELIVSYAL 
VVELVIGYSL 
— FIFNWNWL 
AAEWVLAYML 



I 1 

2445 
ILVILLIFGN 
VMVLLLIFGD 
YLAFLAIFGG 
YFAFLAVFGN 
YTAWFYPLFA 
YTAWFYPLFA 
YTVCFYPLFG 
YLVFLILFVK 
FTKFFYLLGL 



| | 

2455 
MYLRFGLLYF 
NYLRCFLLYF 
VYVKAITLYF 
NYVRCFLMYF 
LISIQILTTW 
LISIQILTTW 
LIGMQLLTTW 
PVAGFVIICY 
SAIMQVFFGY 



2465 
VAQFISTFG- 
VAQMISTVG- 
IFQYLNSLG- 
VSQYLNLWL- 
LPELFMLST- 
LPELLMLST- 
LPEFFMLET- 
CVKYLVLNST 
FASHFISN— 



....|....! 

2475 
-SFLGFHQKQ 
-VFLGYKETN 
-VFLGLQQSI 
-SYFGYVEYS 
-LHWSFRLLV 
-LHWSVRLLV 
-MHWSARFFV 
VLQTGVCFLD 
— SWLMWFII 



I 



....|.. 

2485 
WFLHFVPFDV 
WFLHFIPFDV
WFLQLVPFDV 
WFLHVVNFES 
2VLANMLPAHV 
SLANMLPAHV 
FVANMLPAFT 
WFVQTVFSHF 
SIVQMAPVSA 



....(....! 

2495 
LCNEFLATFI 
ICDELLVTVI 
FGDEIWFFI 
ISAEFVIVVI 
FMRFYIIIAS 
FMRFYIIIAS 
LLRFYIVVTA 
NFMGAGFYFW 
MVRMYIFFAS 



....I.... | 

2505 
VCKIVLFVRH 
VIKVISFVRH 
VTRVLMFIKH 
VVKAVLALKH 
FIKLFSLFRH 
FIKLFSLFRH 
MYKIFCLCRH 
LFYKIYIQVH 
FYYIWKSYVH 



2515 
IIVGCNNADC 
VLFGCENPDC 
VCLGCDKASC 
IVFACSNPSC 
VAYGCSKSGC 
VAYGCSKSGC 
VMYGCSRPGC 
HILYCKDVTC 
IMDGCTSSTC 



2525 



2535 



I 



..|... 
2545 



I 



2555 



I 



..|... 
2565 



I 



2575 
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EMCR 

229E 

PEDV 

TGEV 

OV43 

BOCOV 

MHV 

AIBV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OV43 

BoCoV 

MHV 

AIBV 

SARS CoV 



EMCR 

229E 

PEDV

TGEV 

OV43 

BoCoV 

MHV 

AIBV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OV43

BoCoV 

MHV 

AIBV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OV43 

BoCoV 

MHV 

AIBV 

SARS CoV 



EMCR 
229E 
PEDV 
TGEV 
OV43 
BoCoV 

MHV

AIBV 
SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OV43 

BoCoV 

MHV 

AIBV 



VACSKSARLK 
IACSKSARLK 
VACSKSARLK 
KTCSRTARQT 
LFCYKRNRSL 
LFCYKRNRSL 
LFCYKRNRSV 
EVCKRVARSN 
MMCYKRNRAT 

• ••• |* a >*| 

2585 
ARELGNVVKT 
SRELGNITKT 
ATEVGNVVKL 
VRDLSNSVKQ 
ALDLSKELKR 
ALDLSKELKR 
AADLSKELKR 
AGELSEKLKR 
ARDLSLQFKR 

I | 

2645 
-VLKNCNVLE 
-VFKNCNVLD 
-ALKNCSIIT 
-VLKSMLLLD 
SKVKSVPNMH 
SKVKSVPNMH 
SKVKGVPETH 
KAVFLKEALK 
NNTKGSLPIN 



RVPLQTIING 
RFPVNTIVNG 
RVPVQTIFQG 
RIPIQVWNG 
RVKCSTIVGG 
RVKCSTIVGG 
RVKCSTWGG 
RQEVSWVGG 
RVECTTIVNG 

2595* " 1 
AVQPTAPAYV 
NVQPTGPAYV 
NVQPTGPATI 
TVYATDRSHQ 
PIQPTDVAYH 
PIQPTDVAYH 
PVNPTDSAYY 
HVKPTAYAYH 
PINPTDQSSY 



MHKSFYVNAN 
VQRSFYVNAN 
TSKSFYVHAW 
SMKTVYVHAN 
MIRYYDVMAN 
MIRYYDVMAN 
TLRYYDVMAN 
RKQIVHVYTN 
MKRSFYVYAN 

• | 

2605 
IIDKVDFVNG 
MIDKVEFENG 
LIDKVEFSNG 
EVTKVECSDG 
TVTDVKQVGC 
TVTDVKQVGC 
LVTEVKQVGC 
VVDEACLVDD 
IVDSVAVKNG 



GGTCFCNKHN 
GGSKFCKKHR 
GGSKFCKKHN 
GTGKFCKKHN 
GGTGFCSKHQ 
GGTGFCSKHQ 
GGTGFCAKHQ 
SGYNFCKRHN 
GGRGFCKTHN 



| | 

2615 
FYRLYSGDTF 
FYRLYSCETF 
FYYLYSGDTF 
FYRFYVGDEF 
SMRLFYDRDG 
YMRLFYDRDG 
SMRLFYERDG 
FVNLKYKAAT 
ALHLYFDKAG 



FFCVNCDSFG 
FFCVDCDSYG 
FFCLNCDSYG 
FYCKNCDSYG
WNCIDCDSYK 
WNCIDCDSYK 
WNCLNCSAFG 
WYCRNCDDYG 
WNCLNCDTFC 

I | 

2625 
WRYDFDITES 
WRYNFDITES 
WKYNFDITDS 
TSYDYDVKHK 
QRTYDDVNAS 
QRTYDDVNAS 
QRVYDDVSAS 
PGKDSASSAV 
QKTYERHPLS



PGNTFINGDI 
YGSTFITPEV 
PGCTFINDVI 
FENTFICDEI 
PGNTFITVEA 
PGNTFITVEA 
PGNTFITHEA 
HQNTFMSPEV 
TGSTFISDEV 

|- | 

2635 

KYSCKE 

KYSCKE 

KYTCKE 

KYSSQE 

LFVDYSNLLH 
LFVDYSNLLH 
LFVDMNGLLH 
KCFSVTDFLK 
HFVNLDNLRA 



• »••!«»••) • > i • | • . , , | 
2655 2665 

NFIVYNN SGSNI 

DFIVFNN NGTNV 

DFIVFNN NGSNV 

DFIVYSP SGSAL 

VVWEN DADK 

WVVEN DADK 

VVWEN EADK 

CEQISNDGFI VCNTQSAHAL 
VIVFDGK SKCDE 



2705 
— VDFNGVLH 
— VDFNGVLH 
— VDFGASLH 
— rVDFKGALF 
TGTSVTETMF 
TGTSVTETMF 
TGLSVSQTMF 
V-EPVSKSVI 
DSTEVSVKMF 

•»».)■•*, | 
2765 

VSDDD 

ISDHE 

VPLDT 

VSFST 

SIDSDVDTKC 
SIDSDVDTKC 
AIDSDVETKS 

ITKDEE 

WDTDVDTKD 

...«!»...| 

2825 
VNHNVLIKES 
VNANVLTKDQ 
VNHNVLVKDS 
VNAKVLTQRG 
VQGNVAKIAG 
VQGNVAKIAG
VQANVAKAAN
ANLRVKN — A 
INAQVAKSHN 



. * * • | • ■ • • | 

2715 
KAYVDVLCNS 
KAYIDVLRNS 
SAFVSVLSNS 
NAKKNVIKNS 
DVYVDTFLSM 
DVYVDTFLSM
DLYVDSLLGV 
DKVCSILSSI 
DAYVDTFSAT 

I I 

2775 
FVSAVANAHR 
FTSAISNAHR 
FNAAVAEAHR 
FEMAVNNAHR 
LADSVMSAVS 
LADSVMSAVS 
ITKSIMSAVN 
AVDMAIFCHN 
VIECLKLSHH 

I | 

2835 
IPIVWGVKDF 
TPIVWHAKDF 
IPVVWLVRDF 
KSWWLSQDF 
VSCIWSVDAF 
VSCIWSVDAF 
VACIWSVDAF 
PPWWKFSEL 
VSLIWNVKDY 



I 



2885 

ATSIVAKQGA G 

ATSIVAKQGA GD 

TVCIANKKGA GLPS- 

TESVSPKSGS G 

— TPFSLKGG A — V- 
— TPFSLKGG A — V- 
— TPFSLKGG A — V- 



2895* 



I 



2725 
FFKELTANMS 
FGKDLNANMS 
FGKDLSSCND 
FNVDVSECKN 
FDVDKKSLNA
FDVDKKSLNA 
LDVDRKSLTS 
ISVDTAALNY 
FSVPMEKLKA 

.... | .... | 

2785 
YDVLLSDLSF 
CDVLLSDLSF 
YDVLLTDMSF 
FGILITDRSF 
AGLELTDESC 
AGLELTDESC 
AGVDFTDESC
HDVDYTGDGF 
SDLEVTGDSC 

I | 

2845 
NTLSQEGKKY 
NSLSAEGRKY 
IALSEETRKY 
AALSSTAQKV 
NQFSSDFQHK 
NQLSSDFQHK 
NQLSADLQHR
IKLSDSCLKY 
MSLSEQLRKQ 

2905* " ! 
— FKRTYNFL 
— AGHSLTWL 
— FSKVKKFF 
— FFDVITQL 
— FSYFVYVC 
— FSYFVYVC 
FSKVLQWL 



■ * * * ' — l — l • • • • l • • * • I 

2675 2685 2695 

TQIKNACVYF SQLLCEPIKL VNSELLSTLS 
TQVKNASVYF SQLLCRPIKL VDSELLSTLS 
NQVKNACVYF SQMLCKPVKL VDSALLASLS 
ANVRNACVYF SQLIGKPIKI VNSDLLEDLS 
ANFLNAAVFY AQSLFRPILM VDKNLITTAN 
ANFLNAAVFY AQSLFRPILM VDKILITTAN
AGFLNAAVFY AQSLYRPMLL VEKKLITTAN 
EEAKNAAIYY AQYLCKPILI LDQALYEQLV 
SASKSASVYY SQLMCQPILL LDQVLVSDVG 

1 1 I I 

2735 2745 2755 

-MAECKATLGL T '. 

LAECKRALGL S 

MQDCKSTLGF DD 

LDECYRACNL N 

LIATAHSSIK QGTQIYKVLD TFLSCARKSC 
LIATAHSSIK QGTQICKVLD TFLSCARKSC 
FVNAAHNSLK EGVQLEQVMD TFIGCARRKC 

KAGTLRDALL S 

LVATAHSELA KGVALDGVLS TFVSAARQG-

* ■ • 1 1 1 I I 

2795 2805 2815 

NNFFISYAKP EDK-LSVYDI ACCMRAGSKV 
NNFVSSYAKP EEK-LSAYDL ACCMRAGAKV 
NNFTTSYAKP EEK-FPVHDI ATCMRVGAKI 
NNFWPSKVKP GSSGVSAMDI GKCMTSDAKI
NNLVPTYLKS DN — IVAADL GVLIQNSAKH 
NNLVPTYLKG DN — IVAADL GVLIQNSAKH 
NNLVPTYVKS DT — IVAADL GVLIQNNAKH 
TNVIPSYGID TG-KLTPRDR GFLINADASI 
NNFMLTYNKV EN — MTPRDL GACIDCNARH 

I | | | I .... | 

2855 2865 2875 

LVKTTKAKGL TFLLTFNDNQ AITQVP 

IVKTSKAKGL TFLLTINENQ AVTQIP 

IIRTTKVKGI TFMLTFNDCR MHTTIP 

LVKT FVEEGV NFSLTFNAVG SDDDLPYERF 

LKKACCKTGL KLKLTYNKQM ANVSVLT 

LKKACCKTGL KLELTYNKQM ANVSVLT 

LRKACSKTGL KIKETYNKQE ANVPILT---

LISATVKSGV RFFITKSGAK QVIACHT 

IRSAAKKNNI PFRLTCATTR QWNVIT 



QKLLVEKKAG GIVSGTFKCF KSYFKWLLIF 



* • • • | . • . . ] 

2915 
WYVCLFWAL 
WLLCGLVCLI 
WFLCLFIVAA 
KQIVILVFVF 
FVLSLVCFIG 
FVLSLVCFIG 
FVVNLICFIV 
YILFTACCSG 



....|....| 

2925 
FIGVSFID — 
QFYLCFFMPY 
FFALSFLD — 
IFICGLCSVY 

LWCLMPT 

LWCLMPT 

LWALMPT 

YYYMEVSKSF 



I I 

2935 

YTTTVTS 

— FMYDIVSS 

FSTQVSS 

SVATQSYIES 
YTVHKSDFQL 
YTVHKSDFQL 
YAVHKSDMQL 
VHPMYDVNST 



SARS COV 



EMCR
229E
PEDV
TGEV
OV43
BoCoV
MHV
AIBV
SARS CoV

EMCR 

229E 

PEDV 
TGEV

OV43 

BoCoV 

MHV 

AIBV 
SARS CoV



-TKISLKGG K — I VSTCFKLM LKATLLCVLA ALVCYIVMPV HTLSIHDGYT 






EMCR 

229E 

PEDV 

TGEV 

OV43 

BoCoV 

MHV 

AIBV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV

OV43 

BoCoV 

MHV 

AIBV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OV43 

BoCoV 

MHV 

AIBV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OV43 

BoCoV 

MHV 

AIBV 

SARS CoV 



EMCR 
229E 
PEDV 
TGEV 



I 



2945 
FHGYDFKYIE 
FEGYDFKYIE 
DSDYDFKYIE 
AEGYDYMVIK 
PVYASYKVLD 
PVYASYKVLD 
PLYASFKVID 
LHVEGFKVID 
NEIIGYKAIQ 



2955 
NGQLKVFEAP 
NGQLKNFEAP 
SGQLKTFDNP 
NGIVQPFDDT 
NGVIRDVSVE 
NGVIRDVSVE 
NGVLRDVTVT 
KGVLREIVPE 
DGVTRDIIST 

I 



2965 
LHCVRNVFDN 
LKCVRNVFEN 
LSCVHNVFIN 
ISCVHNTYKG 
DVCFANKFEQ 
DVCFANKFEQ 
DACFANKFIQ 
DTCFSNKFVN 
DDCFANKHAG 



.-..|-.-. I 

2975 
FNQWHEAKFG 
FEDWHYAKFG 
FDQWHDAKFG 
FGDWFKAKYG 
FDQWYESTFG 
FDQWYESTFG 
FDQWYESTFG 
FDAFWGRPYD 
FDAWFSQRGG 



2985 
WTTNSDKCP 
FTPLNKQSCP 
FTPVNNPSCP 
FIPTFGKSCP 
LSYYSNSMAC 
LSYYSNSMAC 
LVYYRNSRAC 
NSRNCPIVTA 
S — YKNDKSC 



I I 

2995 

IVVG VS 

IVVG VS 

IWG VS 

IVVGT-VFDL 
PIVVA-VIDQ 
PIVVA-VVDQ 
PVWA-VIDQ 
VIDGDGTVAT
PVVAA-IITR 



3005 
ERINVVPGVP 
EIVNTVAGIP 
DEARTVPGIP 
ENMRPIPDVP 
DFGSTVFNVP 
DFGSTVFNVP 
DIGYTLFNVP 
GVPGFVSWVM 
EIGFIVPGLP 

| 1 

3065 
CTRLEGLGGD 
CTRLEGLGGN 
CTTLSGLGGT 
CTTLTGLGGT 
CTMFTMADGS 
CTMFAMADGS 
CTMLAHADGT 
CLYLTASNTP 
CTIFKDAMGK 



3015 

TNVYLVG 

SNVYLVG 

AGVYLAG 

AYVSIVG 

TKVLRYG 

TKVLRYG 

TKVLRYG 

DGVMFIHMTQ 
GTVLRAIN — 



I I 

3025 
-KTLVFTLQA 
-KTLIFTLQA 
-KTLVFAINT 
-RSLVFAINA 
YHVLHFITHA 
YHVLHFITHA 
FHVLHFITHA 
TERKPWYIPT 
GDFLHFLPRV 



I - 



| [ 

3125 
RTLATRYCRV 
RTIATKYCRV 
RTKAMTYCRV 
KTQATTYCRV 
RTRSMSYCRV 
RTRSMSYCRV 
RTRSMTYCRV 
KFVSDSYCRG 
TTFDAEYCRH 



3075 
-NVYCYNTDL 
-NVYCYNTAL 
-AVYCYKNGL 
-IVYCAKQGL 
PQPYCYTEGL 
PQPYCYTDGL 
PHPYCYTEGI 
QLYCFNGDND 
PVPYCYDTNL 

• ••*•!••••( 

3135 
GECRDSHKGV 
GECVESNAGV 
GQCVQSAEGV 
GECIDSKAGF 
GLCEEADEGI 
GLCEEADEGI 
GLCEDAEEGV 
SVCEYTRPGY 
GTCERSEVGI 



. I .... I 
3085 
IEGSKPYSIL 
MEGSLPYSSI 
VEGAKLYSEL 
VEGAKLYSDL 
MQNASLYSSL 
MQNASLYSSL 
MHNASLYDSL 
APGALPFGSI
LEGSISYSEL 



. . . . I I 

3035 
AFGNTGVCYD 
AFGNAGVCYD 
IFGTSGLCFD 
AFGVTNMCYD 
LSADGVQCYT 
LSADGVQCYT 
FATDSVQCYT 
WFNREIVGYT 
FSAVGNICYT 

3095 
QPNAYYKYDV 
QANAYYKYDN 
APHSYYKMVD 
MPDYYYEHAS 
VPHVRYNLAN
VPHVRYNLAN 
APHVRYNLAN 
IPHRVYFQPN 
RPDTRYVLMD 



I I 

3045 

FDGVTTS 

I FGVTTP 

ASGVADK 

HTGNAVSKDS 
PHSQISYSNF 
PHSQISYSNF 
PHMQIPYDNF 
QDSIITEGSF 
PSKLIEYSDF 

3105* * 1 
K-NYVRFPEI 
G-NFIKLPEV 
G-NAVSLPEI 
G-NMVKLPAI 
AKGFIRFPEV 
AKGFIRLPEV 
SNGYIRFPEV 
GVRLIVPQQI 
G-SIIQFPNT 



I | 

3055 
— DKCIFNSA 
— EKCIFTSA 
— GACIFNSA 
YFDTCVFNTA 
YASGCVLSSA 
YASGCVLSSA 
YASGCVLSSL 
YTSIALFSAR 
ATSACVLAAE

" *3115* 1 

LARGFGLRT I 

IAQGFGFRTV 

ISRGFGIRTI 

IR-GLGLRFV 

LREGL-VRIV 

LREGL-VRIV 

VSEGI-VRIV 

LHTPY VV 

YLEGS-VRVV 



I I 

3145 
CFGFDKWYVN 
CFGFDKWFVN 
CFGADRFFVY
CFGGDNWFVY 
CFNFNGSWVL 
CFNFNGSWVL 
CFNFNSSWVL 
CVSLNPQWVL 
CLSTSGRWVL



....| I 

3155 

DGRVD DG 

DGRVA NG 

NAESG SD 

DNEFG NG 

NNDYYRSLPG 
NNDYYRSLPG 
NNPYYRAMPG 
FNDEYTSKPG 
NNEHYRALSG 



....I I 

3165 
YICGDGLIDL 
YVCGTGLWNL 
FVCGTGLFTL 
YICGNSVLGF 
TFCGRDVFDL 
TFCGRDVFDL 
TFCGRNAFDL 
VFCGSTVREL 
VFCGVDAMNL 



I I 

3175 
LVNVLSIFSS 
VFNILSMFSS 
LMNVISVFSK 
FKNVFKLFNS 
IYQLFKGLAQ 
IYQLFKGLAQ 
IHQVLGGLVR 
MFSMVSTFFT 
IANIFTPLVQ 



3185 
SFSVVAMSGH 
SFSVAAMSGQ 
TVPVTVLSGQ 
NMSWATSGA 
PVDFLALTAS 
PVDFLALTAS 
PIDFFALTAS 
GVN-PNIYMQ 
PVGALDVSAS 

... . 1 I 

3245 
N-LFFMLLYA 
N-LVTMIAYA 
N-TLGMLGYA 
N-TFFMIIYA 
VYPILSCVYA 
VYPTLSCVYA 
VYPTLSCLYA 
YNSVLAVILL 
AYSFLPGVYS 



3195 
MLFNFLFAAF 
ILLNCALGAF 
ILFNCIIAFV 
MLVNIIIACL 
SIAGAILAVI 
SIAGAILAVI 
SVAGAILAII 
LATMFLILVV 
VVAGGIIAIL 

I i 

3255 

ILYFVFTRTV
ILYFFATRSL
TLYFLCTKGV
IVYYFITRKL
ICYFYATLYF
ICYFYATLYF
CFYFYTTLYF
VLYCYASLVT

VFYLYLTFYF 



3205 
ITFLCFLVTK 
AIFCCFLVTK 
AVAVCFLFTK 
AIAMCYGVLK 
VVLVFYYLIK 
VVLGFYYLIK 
WLAFYYLIK 
VVLIFAMVIK 
VTCAAYYFMK 

• •••I • • » » | 

3265 
R — YAWIWHI 
R — YAWIWCA 
R — YMWIWHL 
A — YPGILDA 
PSEISVIMHL 
PSEISVIMHL 
PSEISWMHL 
SRNTVIIMHC 
TNDVSFLAHL 



I I 

3215 
FKRVFGDLSY 
FRRMFGDLSV 
FKRMFGDMSV 
FKKIFGDCTF 
LKRAFGDYTS 
LKRAFGDYTS 
LKRAFGDYTS 
FQGVFKAYAT 
FRRVFGEYNH 

I I 

3275 
AYIVAYFLLI 
AYLIAYISFA 
GFLISYILIA 
GFIIAYINMA 
QWLVMYGTIM 
QWLVMYGTIM 
QWLVMYGAIM 
WLVFTFGLIV 
QWFAMFSPIV 



I . I 

3225 
GVFTVVCATL 
GVCTVWAVL 
GVFTVGACTL 
LIVMIIVTLV 
WFVNVIVWC 
IVFVNVIVWC 
VWINVIVWC 
TVFITMLVWV 
VVAANALLFL 

....I.... I 

3285 
PWWLLTWFSF 
PWWLCAWYFL 
PWWVLMVYAF 
PWYVITAYIL 
PLWFCLLYIA 
PLWFCLLYIS 
PLWFCIIYVA 
PTWLACCYLG 
PFWITAIYVF 



3235 
INNISYWTQ 
LNNVSYIVTQ 
LNNVSYIVTQ 
VNNVSYFVTQ 
VNFMMLFVFQ 
VNFMMLFVFQ 
INFLMLFVFQ 
INAFILCVHS 
MSFTILCLVP 

...-! 1 

3295 
AAFLELLPNV 
AMLTGLLPSL 
SAIFEFMPNL 
VFLYDSLPSL 
WVSNHAFWV 
VVVSNHAFWV 
WVSNHALWL 
FIIYMYTPLF 
CISLKHCHWF 



3305 
FKLKISTQ--
LKLKVSTN--
FKLKVSTQ--
FKLKVSTN--



....|....| ....|....| ....|....| ....|....| ....|....|

3315 3325 3335 3345 3355 

LFEGDKFI GTFESAAAGT FVLDMRSYER LINT--ISPE KLKNYAASYN

LFEGDKFV GTFESAAAGT FVIDMRSYEK LANS--ISPE KLKSYAASYN

LFEGDKFV GSFENAAAGT FVLDMHAYER LANS--ISTE KLRQYASTYN

--LFEGDKFV GNFESAAMGT FVIDMRSYET IVNS--TSIA RIKSYANSFN
OV43
BoCoV 
MHV 
AIBV 
SARS CoV 



EMCR 

229E

PEDV 

TGEV 

OV43 

BoCoV 

MHV 

AIBV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OV43 

BoCoV

MHV 

AIBV 

SARS CoV 



EMCR

229E 

PEDV 

TGEV 

OV43

BoCoV 

MHV 

AIBV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OV43 

BOCOV 

MHV 

AIBV 

SARS COV 



EMCR 

229E 

PEDV 

TGEV 

OV43 

BoCoV 

MHV 

AIBV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OV43 

BoCoV 

MHV 

AIBV 

SARS CoV 



FSYCRKLG--
PSYCRQLG--
FSYCRKLG--
LWCYGTTKNT
FNNYLRKR--



--TSVR--SD GTFEEMALTT
--TSVR--SD GTPEEMALTT
--TEVR--SD GTFEEMSLTT
RKLYDGNEFV GNYDLAAKST
--VMFNGVTF STFEEAALCT



FMITKDSYCK LKNS--LSDV
FMITKDSYCK LKNS--LSDV
FMITKESYCK LKNS--VSDV

FVIRGSEFVK LTNE IGD 

FLLNKEMYLK LRSETLLPLT 



AFNRYLSLYN 
AFNRYLSLYN 
AFNRYLSLYN 
KFEAYLSAYA 
QYNRYLALYN 



3365 
KYKYYSGSAS 
RYKYYSGNAN 
KYKYYSGSAS 
KYKYYTGSMG 
KYRYYSGKMD 
KYRYYSGKMD 
KYRYFSGKMD 
RLKYYSGTGS 
KYKYFSGALD 



....|....| 

3375 
EADYRCACYA 
EADYRCACYA 
EADYRLACFA 
EADYRMACYA 
TAAYREAACS 
TAAYREAACS 
TAAYREAACS 
EQDYLQACRA 
TTSYREAACC 



-...I. ...| 

3425 
PSGCVERCVV 
PSGFVEKCW 
PSGWEKCIV 
PSGLVEPCIV 
PTSKVEPCVV 
PTSKVEPCIV 
PTSKVEPCVV 
PSSAVEKCIV 
PSGKVEGCMV 

3485 
SHNG-VFLGV 
ISGT-AFLGV 
SSGN-VFLGV 
SKNN-VFLGV 
LFDR-LSLTV 
LFDR-LSLTV 
MSGR-MSLTV 
TTQHGVTLNV 
QAGN-VQLRV 

....I. ...| 

3545 
VNLRTNFTIK 
VNMRTNWTIR 
VNMRSNYTIR 
VNMRSQGTIK 
VTMRSSYTIK 
VTMRSSYTIK 
VTMRSSYTIK 
VTMRSNGTIR 
CAMRPNHTIK 

I | 

3605 
DQPSLQVESA 
DQPNLQVESA 
DQPTLQVEGA 
DQPSMQLEGT 
DAQWQLLIQ 
DAQWQLPVQ 
DAQVVQLPVQ 
DEEVAQRVPP 
DRQTAQAAGT 

3665 
IVSSVEC- — Y 
AMNGEDA — F 
TVGNTDC — F 
ELSSTDA--F 
QVKSDLV — I 
QVKSDLV — I 
SIKADLV — L 
PFSTSTA — I 
PLTQDHVDIL 



....|....| 

3435 
RVCYGSTVLN 
RVCYGNTVLN 
RVCYGNMALN 
RVSYGNNVLN 
SVTYGNMTLN 
SVTYGNMTLN 
SVTYGNMTLN 
SVSYRGNNLN 
QVTCGTTTLN 

I | 

3495
VGVTMHGSVL 
VGATMHGVTL 
VSATMRGALL 
VSARYKGVNL 
MSYQMRGCML 
MSYQMQGCML 
MSYQMQGSLL 
VSRRLKGAVL 
IGHSMQNCLL 

I I 

3555 
GSFINGACGS 
GSFINGACGS 
GSFINGACGS 
GSFIAGTCGS 
GSFLCGSCGS 
GSFLCGSCGS 
GSFLCGSCGS 
ASFLAGACGS 
GSFLNGSCGS 



I .... I 

3385 
HLAKAMLDYA 
YLAKAMLDFS 
HLAKAMMDYA 
HLGKALMDYS 
QLAKAMDTFT 
QLAKAMDTFT 
QLAKAMETFN 
WLAYALDQYR 
HLAKALNDFS 

I | 



3395 
-KDHNDMLYS 
-RDHNDILYT 
-SNHNDTLYT 
-VNRTDMLYT 
NNNGSDVLYQ 
NNNGSDVLYQ 
HNNGNDVLYQ 
-NSGVEIVYT 
-NSGADVLYQ 



3405
PPTISYN-ST 
PPTVSYG-ST 
PPTVSYN-ST 
PPTVSVN-ST 
PPTASVSTSF 
PPTASVSTSF 
PPTASVTTSF 
PPRYSIGVSR 
PPQTSITSAV 



3415 
LQSGLKKMAQ 
LQAGLRKMAQ 
LQAGLRKMAQ 
LQSGLRKMAQ 
LQSGIVKMVN 
LQSGIVKMVN 
LQSGIVKMVF 
LQSGFKKLVS 
LQSGFRKMAF 



3445 
GVWLGDTVTC 
GLWLGDIVYC 
GLWLGDIVMC 
GLWLGDEVIC 
GLWLDDKVYC 
GLWLDDKVYC 
GLWLDDKVYC 
GLWLGDTIYC 
GLWLDDTVYC 

I | 

3505 
RIKVSQSNVH 
KIKVSQTNMH 
QIKVNQNNVH 
VLKVNQVNPN 
VLTVTLQNSR 
VLTVTLQNSR 
VLTVTLQNPN 
ILQTAVANAE 
RLKVDTSNPK 

I I 

3565 
PGYNVRNDGT 
PGYNLKN-GE 
PGYNINN-GT 
VGYVLEN-GI 
VGYVIMG-DC 
VGYVIMG-DC 
VGYVLTG-DS 
VGFNIEK-GV 
VGFNIDY-DC 



3455 
PRHVIAPSTT 
PRHVIASNTT 
PRHVIASSTT 
PRHVIASDTT 
PRHVICSASD 
PRHVICSASD 
PRHVICSSAD 
PRHVLGKFSG 
PRHVICTAED 

3515
TPKHVFKTLK 
TPRHSFRTLK
TPKYTYRTVR 
TPEHKFKSIK 
TPKYTFGWK 
TPKYTFGWK 
TPKYSFGVVK 
TPKYKFIKAN 
TPKYKFVRIQ 



3465 
VL-IDYDHAY 
SA-IDYDHEY 
ST— IDYDYAL 
RV-INYENEM 
MTNPDYTNLL 
MTNPDYTNLL 
MTDPDYSNLL 

DQ WNDVL 

MLNPNYEDLL 

! 1 

3525 
PGASFNILAC 
SGEGFNILAC 
PGESFNILAC 
AGESFNILAC
PGETFTVLAA 
PGETFTVLAA 
PGETFTVLAA 
CGDSFTIACA 
PGQTFSVLAC 



I | 

3475 
STMRLHNFSV 
SIMRLHNFSI 
SVLRLHNFSI 
SSVRLHNFSV 
CRVTSSDFTV 
CRVTSSDFTV 
CRVISSDFCV 
NLANNHEFEV 
IRKSNHSFLV 

I | 

3535 
YEGIASGVFG 
YDGCAQGVFG 
YDGAAAGVYG 
YEGCPGSVYG 
YNGKPQGAFH 
YNGKPQGAFH 
YNGKSQGAFH 
YGGTVVGLYP 
YNGSPSGVYQ 



I ! 

3575 
VEFCYLHQIE 
VEFVYMHQIE 
VEFCYLHQLE 
LYFVYMHHLE 
VKFVYMHQLE 
VKFVYMHQLE 
VRFVYMHQLE 
VNFFYMHHLE 
VSFCYMHHME 



3615 
NLMLSDNVVA 
NQMLTVNWA 
SSLFTENVLA 
NVMSSDNWA 
DYIQSVNFVA 
DYIQSVNFVA 
DYTQTVNVVA 
DNLVTNNIVA 
DTTITLNVLA 



....|....| ....|....|

3625 3635 

FLYAALLNGC R WWL 

FLYAAILNGC T WWL 

FLYAALINGS T WWL 

FLYAALINGE R WFV 

WLYAAILNNC N WFV 

WLYAAILNNC N WFV 

WLYAAILNRC N WFV 

WLYAAIISVK ESSFSLPKWL 
WLYAAVINGD R WFL 



I v — | 
3675 
SILAAKTGVS 
SILAAKTGVC 
SILAAKTGVD 
SMLAAKTGQS 
DALASMTGVS 
DALASMTGVS 
DALASMTGVT 
TKLSAITGVD 
GPLSAQTGIA 



....|....|

3685 
VEQLLASIQH 
VERLLHAIQV 
VQRLLASIQS 
VEKLLDSIVR 
LETLLAAIKR 
LETLLAAIKR 
VEQILAAIKR 
VCKLLRTIMV 
VLDMCAALKE 



. . I ► . . 
3725 



3735 



..|... 
3745 



1 



3695 
LHE-GFGGKN 
LNN-GFGGKQ 
LHK-NFGGKQ 
LNK-GFGGRT 
LKN-GFQGRQ 
LKN-GFQGRQ 
LYS-GFQGKQ 
KNS-QWGGDP 
LLQNGMNGRT 

....I.... | 
3755 



....|....|

3585 
LGSGAHVGSD 
LGSGSHVGSS 
LGSGCHVGSD 
LGNGSHVGSN 
LSTGCHTGTD 
LSTGCHTGTD 
LSTGCHTGTD 
LPNALHTGTD 
LPTGVHAGTD 

• • - • I I 

3645 
RSTRVNVDGF 
KGEKLFVEHY 
SSSRIAVDRF 
TNTSMSLESY 
QSDKCSVEDF 
QSDKCSVEDF 
QSDSCSLEEF 
ESTTVSVDDY 
NRFTTTLNDF 

....|....|

3705 
ILGYSSLCDE 
ILGYSSLNDE 
ILGHTSLTDE 
ILSYGSLCDE 
IMGSCSFEDE 
IMGSCSFEDE 
ILGSCVLEDE 
ILGQYNFEDE 
ILGSTILEDE 



I | 

3595 
FTGSVYGNFD 
FDGVMYGGFE 
LDGVMYGGYE 
FEGEMYGGYE 
FNGDFYGPYK 
FNGDFYGPYK 
FSGNFYGPYR 
LMGEFYGGYV 
LEGKFYGPFV 

I I 

3655 
NEWAMANGYT
NEWAQANGFT 
NEWAVHNGMT 
NTWAKTNSFT 
NVWALSNGFS 
NVWALSNGFS 
NVWAMTNGFS 
NKWAGDNGFT 
NLVAMKYNYE 

....|....|

3715 
FTLAEVVKQM 
FSINEWKQM 
FTTGEWRQM 
FTPTEVIRQM 
LTPSDVYQQL 
LTPSDVYQQL 
LTPSDVYQQL 
LTPESVFNQI 
FTPFDVVRQC 



3765 



I 



-.1... 
3775 



I 






EMCR YGVNLQS GKVIFGLKTM FLFSVFFTMF WAELFIYTNT IWINPVILTP IFCLLLFLSL

229E FGVNLQS GKTTSMFKSI SLFAGFFVMF WAELFVYTTT IWVNPGFLTP FHILLVALSL

PEDV YGVNLQG GYVSRACRNV LLVGSFLTFF WSELVSYTKF FWVNPGYVTP MFACLSLLSS

TGEV YGVNLGA GKVKSFFYPI MTAMTILFAF WLEFFMYTPF TWINPTFVSI VLAVTTLIST

OV43 AGIKLQSKRT RLFKGTVCWI MASTFLFSCI ITAFVKWTMF MYVTTNMFS- ITFCALCVIS 

BoCoV AGIKLQSKRT RLVKGIVCWI MASTFLFSCI ITAFVKWTMF MYVTTNMLS- ITFCALCVIS 

MHV AGVKLQSKRT RWKGTCCWI LASTLLFCSI ISAFVKWTMF MYVTTHMLG- VTLCALCFVS 

AIBV GGVRLQS SFVRKATSW FWSRCVLACF LFVLCAIVLF TAVPLKFYVY AAVILLMAVL 

SARS CoV SGVTFQGKFK KIVKGTHHWM LLTFLTSLLI LVQSTQWSLF FFVYENAFLP FTLGIMAIAA 

1 .... I ....|....| I I I I 

3785 3795 3805 3815 3825 3835 

EMCR VLTMFLKHKF LFLQVFLLPT VIATALYN CVLDYYIV KFLADHFN-Y NVSVLQMDVQ 

229E CLTFVVKHKV LFLQVFLLPS IIVAAIQN CAWDYHVT KVLAEKFD-Y NVSVMQMDIQ 

PEDV LLMFTLKHKT LFFQVFLIPA LIVTSCIN LAFDVEVY NYLAEHFD-Y HVSLMGFNAQ

TGEV VFVSGIKHKM LFFMSFVLPS VILVTAHN LFWDFSYY ESLQSIVENT NTMFLPVDMQ 

OV43 LAMLLVKHKH LYLTMYITPV LFTLLYNNY- -LWYKHTFR GYV YAWLS YY VPSVEYTYTD 

BoCoV LAMLLVKHKH LYLTMYIIPV LFTLLYNNY- -LVVYKQTFR GYVYAWLSYY VPSVEYTYTD 

MHV FAMLLVKHKH LYLTMFIMPV LCTLFYTNY- -LWYKQSFR GLAYAWLSHF VPAVDYTYMD 

AIBV FISFTVKHVM AYMDTFLLPT LITVIIGVCA EVPFIYNTLI SQWIFLSQW YDPWFDTMV

SARS CoV CAMLLVKHKH AFLCLFLLPS LATVAYFN MVYMPASWV MRIMTWLELA DTSLSGYRLK 

1 I ....(....I ....I.. ..I 1 I I 

3845 3855 3865 3875 3885 3895 

EMCR GLVNVLVCLF WFLH TW RFSKERFTHW FTYVCSLIAV AYTYFYSGD F

229E GFVNIFICLF VALLH TW RFAKERCTHW CTYLFSLIAV LYTALYSYD Y 

PEDV GLVNIFVCFV VTILHGTYTW RFFN— TPASS VTYWALLTA AYNYFYASD 1 

TGEV GVMLTVFCFI VFVTYSVRFF TCKQSWFSLA VTTILVIFNM VKIFGTSDEP WTENQIAFCF 

OV43 EVIYGMLLLV GMVFVTLRSI NHDLFSFIMF VGRLISVFSL WYKGSNLEEE I

BOCOV EVIYGMLLLI GMVFVTLRSI NHDLFSFIMF VGRVISWSL WYMGSNLEEE 1

MHV EVLYGWLLV AMVFVTMRSI NHDVFSVMFL VGRLVSLVSM WYFGANLEEE V 

AIBV PWMFLPLVLY TAFKCVQGCY MNSFNTSLLM LYQFVKLGFV IYTSSNTLTA YTEGNWELFF 

SARS COV DCVMYASALV LLILMTARTV YDDAARRVWT LMNVITLVYK VYYGNALDQA 1 

-as ....I 1 . ... I .... I ...-I. ...I ..I .-..I t 

3905 3915 3925 3935 3945 3955

EMCR LSLLVMFLCA ISSDWYIGAI VFRLSRLIIF FSPE-- SVFS VFGDVKLTLV VYLICGYLVC 

229E VSLLVMLLCA ISNEWYIGAI I FRICRFGVA FLPV— EYVS YFDGVKTVLL FYMLLGFVSC 

PEDV LSCAMTLFAS VTGNWFVGAV CYKVAVYMAL RFP TFVA IFGDIKSVMF CYLVLGYFTC 

TGEV VNMLTMIVSL TTKDWMVVIA SYRIAYYIW CVMP-SAFVS DFGFMKCISI VYMACGYLFC

OV43 LLMLASLFGT YTWT TVL SMAVAKVIAK WVAVNVLYFT DIPQIKIVLL CYLFIGYIIS

BOCOV LLMLASLFGT YTWT TAL SMAAAKVIAK WVAVNVLYFT DIPQIKIVLV CYLFIGYIIS 

MHV LLFLTSLFGT YTWT TML SLATAKVIAK WLAVNVLYFT DVPQVKLVLL SYLCIGYVCC 

AIBV ELVHTTVLAN VSSNSLIGLF VFKCAKWMLY YCN AT YLNNYVLMAV MVNCIGWLCT 

SARS COV SMWALVISVT SNYSGVVTTI MFLARAIVFV CVEYYPLLFI TGNTLQCIML VYCFLGYCCC

....|....| ....I. ...I ..-.I.. ..I ....I.. ..I 

3965 3975 3985 3995 4005 4015 

EMCR TYWGILYWFN RFFKCTMGVY DFKVSAAEFK YMVANGLHAP YGPFDALWLS FKLLGIGGDR 

229E MYYGLLYWIN RFCKCTLGVY DFCVSPAEFK YMVANGLNAP NGPFDALFLS FKLMGIGGPR
RFFRLCNELA
RFYRLSNELA 
RFYRLANECA 
RFYRLANECA 
RFYRLANECA 
RIYRLYNECA 
RFYRLANECA 

I 



THSVRLTITE 
THSTRLTITE 
LHSSRLSINE 
LDTMKLSMTD 
THRYRLSLKD 
THRYRLSLKD 
THRYRLSLKD 
MSFSKMGLSQ 
LHSSRLSFKE 

I | 

5015 
NFLRLRGFFD 
DFLRSQGFFD 
DFLLEQGFFS 
DFITERGFFE 
DFVLSKGLLK 
DFILSKGLLK 
EFILSKGLLK 
DFAEKAGMFK 
DFAVSKGFFK 

....|....|

5075 
DIYEGGCIKA 
DCYEGGCITS 
DIYEGGCITA 
ECYDGGCINA 
EIYDGGCIPA 
EIYDGGCIPA
EIYDGGCIPA 
ECYEGGCIPA 
DCYDGGCINA 

I | 

5135 
QLNLKYAISG 
QLNLKYAISG 
QLNLKYAISG 
QMNLKYAISG 
QMNLKYAISA 
QMNLKYAISA 
QMNLKYAISA 
QMNLKYAISA 
QMNLKYAISA 

5195 
MLRTLIDGVE 
MLKNLMADVD 
MLKNLIDGVE
MLKNLMRDVD 
MLRRLIKDVD
MLRRLIKDVD
MLRRLIKDVD
MLRNLIQGVE 
MLKTVYSDVE 



LLQFVTDPSL 
LLQFVTDPTL 
LLQFCSDPAL 
LLRFVTDPTL 
LLLYAADPAL 
LLLYAADPAL 
LLLYAADPAL 
LMQFVGDPAL 
LLVYAADPAM 

] | 

5025 
EGSELTLKHF 
EGSELTLKHF 
EGSELTLKHF 
EGSELTLKHF 
EGSSVDLKHF 
EGSSVDLKHF 
EGSSVDLKHF 
EGSSIPLKHF 
EGSSVELKHF 

....|....|

5085 
CEVVVTNLNK 
REWVTNLNK 
KEWVTNLNK 
REWVTNYDK 
SQVIVNNYDK 
AQVIVNNYDK 
TQVIVNNYDK 
SQVWNNLDK 
NQVIVNNLDK 

....I | 

5145 
KERARTVGGV
KERARTVGGV 
KERARTVGGV 
KARARTVGGV 
KNRARTVAGV 
KNRARTVAGV 
KNRARTVAGV 
KNRARTVAGV 
KNRARTVAGV 



IIASSPALVD 
IVASSPALVD 
LIASSPALVD 
LVASSPALLD 
HVASASALYD 
HVASASALYD 
HVASASALLD 
LVGTSNNLVD 
HAASGNLLLD 

....|....|

5035 
FFAQNGDAAV 
FFTQKGDAAI 
FFAQKVDAAV 
FFAQGGEAAM 
FFTQDGNAAI 
FFTQDGNAAI 
FFTQDGNAAI 
FYPQTGNAAI 
FFAQDGNAAI 

I .... I 

5095 
SAGWPLNKFG 
SAGWPLNKFG 
SAGYPLNKFG 
SAGYPLNKFG 
SAGYPFNKFG 
SAGYPFNKFG 
SAGYPFNKFG 
SAGYPFNKFG 
SAGFPFNKWG 



....|....|

5155 
SLLSTMTTRQ 
SLLATMTTRQ 
SLLSTMTTRQ 
SLLSTMTTRQ 
SILSTMTGRM 
SILSTMTGRM 
SILSTMTGRM 
SILSTMTNRQ
SICSTMTNRQ 



5205 
NPMLMGWDYP 
DPKLMGWDYP 
NPCLMGWDYP 
NGCLMGWDYP 
NPVLMGWDYP 
NPVLMGWDYP 
SPVLMGWDYP 
DPILMGWDYP 
TPHLMGWDYP 



I - 



....|....|

5255 
QVLTEWYSN 
QVLTEVVYSN 
QVLTEWYSN 
QVLTEVVHCT 
QVLSEIVMCG 
QVLSEIVMCG 
QVLSEIVMCG 
QVLSETVLAT 
QVLSEMVMCG 



I | 

5265 
GGFYFKPGGT 
GGFYFKPGGT 
GGFYLKPGGT 
GGFYFKPGGT 
GCYYVKPGGT 
GCYYVKPGGT 
GCYYVKPGGT 
GGIYVKPGGT 
GSLYVKPGGT 



-I.-.. I 
5215 
KCDRALPNMI 
KCDRAMPSMI 
KCDRALPNMI 
KCDRALPNMI 
KCDRAMPNLL 
KCDRAMPNIL 
KCDRAMPNIL 
KCDRAMPNLL 
KCDRAMPNML 

1 | 

5275 
TSGDASTAYA 
TSGDATTAYA 
TSGDATTAYA 
TSGDGTTAYA 
SSGDATTAFA 
SSGDATTAFA
SSGDATTAFA 
SSGDATTAYA 
SSGDATTAYA 



5305 
PSDSCNNVNV 
NSSNCNNFNV 
DSNVCHNLEV 
DSNACNNVTV 
NGNKIEDLSI 
NGNKIEDLSI 
NGHKIEDLSI 
ITRDIVYDNI 



5315 
RDLQRRLYDN 
KKLQRQLYDN 
KQLQRKLYEC 
KSIQRKIYDN 
RALQKRLYSH 
RALQKRLYSH 
RELQKRLYSN 
KSLQYELYQQ 



5325 
CYRLTSVEES 
CYRNSNVDES 
CYRSTIVDDQ 
CYRSSSIDEE 
VYRSDKVDST 
VYRSDMVDST 
VYRADHVDPA 
VYRRVNFDPA 



I. ...| 

5335 
FIDDYYGYLR 
FVDDFYGYLQ 
FVVEYYGYLR 
FVVEYFSYLR 
FVTEYYEFLN 
FVTEYYEFLN 
FVNEYYEFLN 
FVEKFYSYLC 



SARS COV 



EMCR
229E 
PEDV 
TGEV 
OV43 
BoCoV
MHV 
AIBV 

SARS CoV 


EMCR 

229E 

PEDV 
TGEV

OV43 

BoCoV 

MHV 

AIBV 
SARS COV



EMCR 
229E

PEDV 

TGEV 

OV43 

BOCOV 
MHV

AIBV 

SARS CoV 




EMCR 
229E 
PEDV 
TGEV 
OV43
BoCoV 
MHV 
AIBV 
SARS COV 




EMCR 

229E 
PEDV

TGEV 

OV43 

BoCoV 

MHV 
AIBV

SARS CoV 



EMCR

229E 

PEDV 

TGEV 

OV43 
BoCoV

MHV 

AIBV 

SARS CoV 






NSVFNICQAV TANVNALLST DGNKIADKYV RNLQHRLYEC LYRNRDVDHE FVDEFYAYLR 



■ I I 

5345 
KHFSMMILSD 
KHFSMMILSD 
KHFSMMILSD 
KHFSMMILSD 
KHFSMMILSD 
KHFSMMILSD 
KHFSMMILSD 
KNFSLMILSD 
KHFSMMILSD 



I I 

5355 
DGVVCYNKDY 
DSWCYNKTY 
DGWCYNNDY 
DGWCYNKDY 
DGVVCYNSDY 
DGWCYNSDY 
DGVVCYNSEF 
DGWCYNNTL 
DAVVCYNSNY 



| 1 

5365
AELGYIADIS 
AGLGYIADIS 
ASLGYVADLN 
ADLGYVADIN 
ASKGYIANIS 
ASKGYIANIS 
ASKGYIANIS 
AKQGLVADIS 
AAQGLVASIK 



....|....|
5375 
AFKATLYYQN 
AFKATLYYQN 
AFKAVLYYQN 
AFKATLYYQN 
AFQQVLYYQN 
AFQQVLYYQN 
AFQQVLYYQN 
GFREVLYYQN 
NFKAVLYYQN 



i 1 

5385 
NVFMSTSKCW 
GVFMSTAKCW 
NVFMSASKCW 
NVFMSTSKCW 
NVFMSESKCW 
NVFMSESKCW 
NVFMSEAKCW 
NVFMADSKCW 
NVFMSEAKCW 



5395 
VEEDLTKGPH 
TEEDLSIGPH 
IEPDINKGPH 
VEPDLSVGPH 
VEHDINNGPH 
VENDINNGPH 
VETDIEKGPH 
VEPDLEKGPH 
TETDLTKGPH 



5405 
EFCSQHTMQI 
EFCSQHTMQI 
EFCSQHTMQI 
EFCSQHTLQI 
EFCSQHTMLV 
EFCSQHTMLV 
EFCSQHTMLV 
EFCSQHTMLV 
EFCSQHTMLV 



5415 
VDKDGTYYLP 
VDENGKYYLP 
VDKEGTYYLP 
VGPDGDYYLP 
KMDGDDVYLP 
KMDGDDVYLP 
KMDGDEVYLP 
EVDGEPKYLP 
KQGDDYVYLP 



5425 
YPDPSRILSA 
YPDPSRIISA 
YPDPSRILSA 
YPDPSRILSA 
YPNPSRILGA 
YPVPSRILGA 
YPDPSRILGA 
YPDPSRILGA 
YPDPSRILGA 



I I 

5435 
GVFVDDVVKT 
GVFVDDITKT 
GVFVDDVVKT 
GVFVDDIVKT 
GCFVDDLLKT 
GCFVDDLLKT 
GCFVDDLLKT 
CVFVDDVDKT 
GCFVDDIVKT 

1 



.. I 

5445 
DAVVLLXRYV 
DAVILLERYV 
DAW LLERY V 
DNVIMLERYV 
DSVLLIERFV 
DSVLLIERFV 
DSVLLIERFV 
EPVAVMERYI 
DGTLMIERFV 



| | 

5455 
SLAIDAYPLS 
SLAIDAYPLS 
SLAIDAYPLS 
SLAIDAYPLT 
SLAIDAYPLV 
SLAIDAYPLV 
SLAIDAYPLV 
ALAIDAYPLV 
SLAIDAYPLT 



5465 
KHPNSEYRKV 
KHPKPEYRKV 
KHENPEYKKV 
KHPKPAYQKV 
YHENEEYQKV 
YHENEEYQKV 
YHENPEYQNV 
HHENEEYKKV 
KHPNQEYADV 

....|....|

5525 
LQAAGLCWC 
LQAAGLCWC
LQSAGLCWC 
LQAAGMCVVC 
MQSVGACWC 
MQSVGACWC 
MQSVGACWC 
LQSCGVCWC 
LQAVGACVLC 



5475 
FYVLLDWVKH 
FYALLDWVKH 
FYVLLDWVKH 
FYTLLDWVKH 
FRVYLAYIKK 
FRVYLEYIKK 
FRVYLEYIKK 
FFVLLAYIRK 
FHLYLQYIRK 

. . : : I I 

5535 
GSQTVLRCGD 
GSQTVLRCGD 
GSQTVLRCGD 
GSQTVLRCGD 
SSQTSLRCGS 
SSQTSLRCGS 
SSQTSLRCGS 
NSQTILRCGN 
NSQTSLRCGA 



5485 
LNKNLNEGVL 
LNKTLNEGVL 
LYKTLNAGVL 
LQKNLNAGVL 
LYNDLGNQIL 
LYNELGNQIL 
LYNDLGNQIL 
LYQELSQNML 
LHDELTGHML 

I I 

5545 
CLRKPMLCTK 
CLRRPMLCTK 
CLRRPMLCTK 
CLRRPLLCTK 
CIRKPLLCCK 
CIRKPLLCCK 
CIRKPLLCCK 
CIRKPFLCCK 
CIRRPFLCCK 



5495 
ESFSVTLLDN 
ESFSVTLLDE 
ESFSVTLLED 
DSFSVTMLEE 
DSYSVILSTC 
DSYSVILSTC 
DSYSVILSTC 
MDYSFVMDID 
DMYSVMLTND 

I I 

5555 
CAYDHVFGTD 
CAYDHVFGTD 
CAYDHVIGTT 
CAYDHVMGTK 
CCYDHVMATD
CCYDHVMATD 
CAYDHVMSTD 
CCYDHVMHTD 
CCYDHVISTS 



5505 
QEDKFWCEDF 
HESKFWDESF 
STAKFWDESF 
GQDKFWSEEF 
DGQKFTDESF 
DGQKFTDESF 
DGQKFTDETF 
KGSKFWEQEF 
NTSRYWEPEF 

| I 

5565 
HKFILAITPY 
HKFILAITPY 
HKFILAITPY 
HKFIMSITPY 
HKYVLSVSPY 
HKYVLSVSPY 
HKYVLSVSPY 
HKNVLSINPY 
HKLVLSVNPY 



.•..I I 

5515 
YASMYENSTI 
YASMYEKSTV 
YANMYEKSAV 
YASLYEKSTV 
YKNMYLRSAV 
YKNMYLRSAV 
YKNMYLRSAV 
YENMYRAPTT 
YEAMYTPHTV 

| | 

5575 
VCNASGCGVS 
VCNTSGCNVN 
VCCASDCGVN 
VCSFNGCNVN 
VCNAPGCDVN 
VCNAPGCDVN 
VCNSPGCDVN 
ICSQLGCGEA 
VCNAPGCDVT 



I 



I 



5585 
DVKKLYLGGL 
DVTKLYLGGL 
DVTKLYLGGL 
DVTKLFLGGL 
DVTKLYLGGM 
DVTKLYLGGM 
DVTKLYLGGM 
DVTKLYLGGM 
DVTQLYLGGM 

....|....| 

5645 
DYKLANDVKD 
DYKLANDAKE 
DYRLANDVKD 
DYKLANNVKE 
DYILANECTE 
DYILANECTE 
DYVLANECTE 
PYILANRCSD 
DYILANTCTE 



5595 
NYYCTNHKPQ 
NYYCVDHKPH 
SYWCHEHKPR 
SYYCMNHKPQ 
SYYCEDHKPQ 
SYYCEDHKPQ 
SYYCEDHKPQ 
SYFCGNHKPK 
SYYCKSHKPP 

| I 

5655 
TLRLFAAETI 
SLRLFAAETV 
SLRLFAAETI 
SLKIFAAETV 
RLKLFAAETQ 
RLKLFAAETQ 
RLKLFAAETQ 
SLRRFAAETV 
RLKLFAAETL 



5605 
LSFPLCSAGN 
LSFPLCSAGN 
LAFPLCSAGN 
LSFPLCANGN 
YSFKLVMNGL
YSFKLVMNGM
YSFKLVMNGM 
LSIPLVSNGT 
ISFPLCANGQ 

....|....|

5665 
KAKEESVKSS 
KAKEESVKSS 
KAKEESVKSS 
KAKEESVKSE 
KATEEAFKQS 
KATEEAFKQS 
KATEESFKQC 
KATEELHKQQ 
KATEETFKLS 



5615
IFGLYKNSAT 
VFGLYKSSAL 
VFGLYKNSAT 
VFGLYKSSAV 
VFGLYKQSCT 
VFGLYKQSCT 
VFGLYKQSCT 
VFGIYRANCA 
VFGLYKNTCV 

5675 
YAFATLKEW 
YAYATLKEIV 
YACATLHEW 
YAYAVLKEVI 
YASATIQEIV 
YASATIQEIV 
YASATIREIV 
FASAEVREVF 
YGIATVREVL 



5625 
GSLDVEVFNR 
GSMDIDVFNK 
GSPDVEDFNR 
GSEAVEDFNK 
GSPYIDDFNR 
GSPYIDDFNR 
GSPYIEDFNK 
GSENVDDFNQ 
GSDNVTDFNA 

....|....|

5685 
GPKELLLSWE 
GPKELLLLWE 
GPKELLLKWE 
GPKEIVLQWE 
SERELILSWE 
SERELILSWE 
SDRELILSWE 
SDRELILSWE 
SDRELHLSWE 



5635 
LATSDWTDVR 
LSTSDWSDIR 
IATSDWTDVS 
LAVSDWTNVE 
IASCKWTDVD 
IASCKWTDVD 
IASCKWTEVD 
LATTNWSIVE 
IATCDWTNAG 

I I 

5695 
SGKVKPPLNR 
SGKAKPPLNR 
VGRPKPPLNR 
ASKTKPPLNR 
IGKVKPPLNK 
IGKVKPPLNK 
IGKVRPPLNK 
PGKTRPPLNR 
VGKPRPPLNR 



I 



5755 



EMCR 
229E 
PEDV 
TGEV 



..|....| . ... I .... I 
5705 5715 5725 5735 5745 

NSVFTCFQIS KDSKFQIGEF IFEKVEYGSD TVTYKSTVTT KLVPGMI FVL TSHNVQPLRA 
NSVFTCFQIT KDSKFQVGEF VFEKVDYGSD TVTYKSTATT KLVPGMLFIL TSHNVAPLRA 
NSVFTCYHIT KNTKFQIGEF VFEKAEYDND AVTYKTTATT KLVPGMVFVL TSHNVQPLRA 
NSVFTCFQIS KDTKIQLGEF VFEQSEYGSD SVYYKSTSTY KLTPGMIFVL TSHNVSPLKA 
OV43
BoCoV 
MHV 
AIBV 

SARS CoV 



EMCR 

229E

PEDV 

TGEV

OV43 

BoCoV 

MHV 

AIBV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OV43 

BoCoV 

MHV 

AIBV 

SARS COV 



EMCR 

229E 

PEDV 

TGEV 

OV43 

BOCOV 

MHV 

AIBV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OV43 

BoCoV 

MHV 

AIBV 

SARS COV 



EMCR 

229E 

PEDV 

TGEV 

OV43 

BoCoV 

MHV 

AIBV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OV43 

BoCoV 

MHV 

AIBV 

SARS CoV 



NYVFTGYHFT 
NYVFTGYHFT 
NYVFTGYHFT 
NYVFTGYHFT 
NYVFTGYRVT 



KNGKTVLGEY
KNGKTVLGEY 
SNGKTVLGEY 
RTSKVQLGDF 
KNSKVQIGEY 



....I.... I 

5765 
PTIANQEKYS 
PTMANQEKYS 
PTIANQERYS 
PILVNQEKYN 
PTLVPQENYS 
PTLVPQENYS 
PTLVPQENYT 
PTLCPQQTFS 
PTLVPQEHYV 

I | 

5825 
YYPGARIVFV 
YYPGARIVFT 
YYPGARIVFT 
YYPQARIVYT 
FYCTARWYT 
YYCTARVVYT 
YYCTARWYT 
YFSSARWFT 
YYPSARIVYT 



....|. ...| 

5775 
SIYKLHPAFN 
TIYKLHPSFN 
TIHKLHPAFN 
TISKLYPVFN 
SIR-FASVYS 
SIR-FASVYS 
SIR-FASVYS 
RFVNLRPNVM 
RITGLYPTLN 

I | 

5835 
ACAHAAVDSL 
ACSHAAVDSL 
ACSHAAVDSL
ACSHAAVDAL
AASHAAVDAL
AASHAAVDAL 
AASHAAVDAL 
ACSHAAVDAL 
ACSHAAVDAL



VFDKSELT- 
VFDKSELT- 
VFDKSELT- 
TFEKGEGK- 
TFEKGDYG- 



N GVYYRATTTY 
N GVYYRATTTY 
N GVYYRATTTY 
D WYYKATSTA 
D AVVYRGTTTY 



KLSVGDVFVL 
KLSVGDVFVL 
KLSVGDVFIL 
KLSVGDIFVL 
KLNVGDYFVL 



TSHSVANLSA 
TSHSVANLSA 
TSHAVSSLSA 
TSHNWSLVA 
TSHTVMPLSA 



I | 

5885 
STVNALPECN 
STVNALPEVN 
STVNALPECN 
CTVNALPEAS 
TTINALPEMV 
TTINALPEMV 
TTINALPELV 
STINALPEVS 
CTVNALPETT 

....|....|

5945 
EPVDYNWTQ 
EPIDYNVVTQ 
EPKDYNWTQ 
QPQDYNWTK 
EPKYFNTVTK 
EPKYFNTVTK 
EPRYFNSVTK 
SPKDYNWTN 
EPEYFNSVCR 

....|....|
6005 

GNVQVDN 

GSVQVDN 

GNVQVDN 

GQVQIES 

GVTTHES 

GVTTHES 

GQTTHES 

NGNSDVGHES 
GVITHDV 



....|....|

6065 
SSQGSEYDYV 
SAQGSEYDYV 
SSQGSEYDYV 
SAQGSEYDYV 
SAQGSEYDYV 
SAQGSEYDYV 
SAQGSEYDFV 
SSQGSEYDYV 
SSQGSEYDYV 

....|....|
6125 



I | 

5895 
ADIWVDEVS 
ADIVVVDEVS 
ADIWVDEVS 
CDIVWDEVS 
TDIVVVDEVS 
TDIVWDEVS 
TDIIWDEVS 
CDILLVDEVS 
ADIWFDEIS 

I | 

5955 
RMCAIGPDVF 
RMCAIGPDVF 
RMCALKPDVF 
RMCTLGPDVF 
LMCCLGPDIF 
LMCCLGPDIF 
LMCCLGPDIF 
LMVCVKPDIF 
LMKTIGPDMF 

....|....|

6015 
GSSINRKQLE 
GSSINRRQLD 
GSSINRRQLD 
NSSINNKQLE 
SSAVNMQQIY 
SSAVNMQQIY 
SSAVNMQQIY 
GSAYNTTQLE 
SSAINRPQIG 

6075 
IYAQTSDTAH 
IFAQTSDTAH 
IYAQTSDTAH 
IYTQTSDTQH 
IYSQTAETAH 
IYSQTAETAH 
IYSQTAETAH 
IFCVTADSQH 
IFTQTTETAH 



5785 
VSDAYANLVP 
VSDAYANLVP 
IPEAYSSLVP 
IAEAYNTLVP 
VLETFQNNW 
VLETFQNNW 
VPETFQNNVP 
VPECFVNNIP 
ISDEFSSNVA 

I | 

5845 
CAKAMTVYSI 
CAKAVTAYSV 
CVKASTAYSN 
CEKAAKNFNV 
CEKAYKFLNI 
CEKAYKFLNI 
CEKAYKFLNI 
CEKAFKFLKV 
CEKALKYLPI 

....|....|

5905 
MCTNYDLSVI 
MCTNYDLSVI 
MCTNYDLSVI 
MCTNYDLSVI 
MLTNYELSVI 
MLTNYELSVI 
MLTNYELSVI 
MLTNYELSFI 
MATNYDLSW 



* I I 

5965 
LHKCYRCPAE 
LHKCYRCPAE 
LHKCYRCPAE 
LHKCYRCPAE 
LGTCYRCPKE 
LGTCYRCPKE
LGTCYRCPKE 
LAKCYRCPKE
LGTCRRCPAE 

6025
IVKLFLVKNP 
WKRFIHKNS 
WRMFLAKNP 
VVKAFLAHNP 
LINKFLKANP 
LINKFLKANP 
LISKFLKANP 
FVKDFVCRNK 
WREFLTRNP 

- . I V. \ . | 

6085 
ACNVNRFNVA 
ACNANRFNVA 
ASNVNRFNVA 
ATNVNRFNVA 
SVNVNRFNVA 
SVNVNRFNVA 
SVNVNRFNVA 
ALNINRFNVA 
SCNVNRFNVA 



5795 
YYQLIGKQKI 
YYQLIGKQRI 
YYQLIGKQKI 
YYQMIGKQKF 
NYQHIGMKRY 
NYQHIGMKRY 
NYQHIGMKRY 
LYHLVGKQKR 
NYQKVGMQKY 

5855 
DKCTRIIPAR 
DKCTRIIPAR 
DKCSRIIPQR 
DRCSRIIPQR 
NDCTRIVPAK 
NDCTRIVPAK 
NDCTRIVPAK 
DDCTRIVPQR 
DKCSRIIPAR 

5915 
NQRLSYKHIV 
NQRISYKHIV 
NQRISYRHW 
NSRLSYKHIV 
NARIRAKHYV 
NARIRAKHYV 
NSRVRAKHYV 
NGKINYQYW 
NARLRAKHYV 

-••-I I 

5975 
IVNTVSELVY 
IVNTVSELVY 
IVRTVSEMVY 
IVKTVSALVY 
IVDTVSALVY 
IVDTVSALVY 
IVDTVSALVY 
IVDTVSTLVY 
IVDTVSALVY 



5805 
TTIQGPPGSG 
TTIQGPPGSG 
TTIQGPPGSG 
TTIQGPPGSG 
CTVQGPPGTG 
CTVQGPPGTG 
CTVQGPPGTG 
TTVQGPPGSG 
STLQGPPGTG 

....I.... | 

5865 
ARVECYSGFK 
ARVECYSGFK 
ARVECYDGFK 
IRVDCYTGFK
VRVECYDKFK 
VRVECYDKFK 
VRVDCYDKFK 
TTVDCFSKFK 
ARVECFDKFK 



I 



I | 

5925 
YVGDPQQLPA 
YVGDPQQLPA 
YVGDPQQLPA 
YVGDPQQLPA 
YIGDPAQLPA 
YIGDPAQLPA 
YIGDPAQLPA 
YVGDPAQLPA 
YIGDPAQLPA 

....|.-.. | 

5985 
ENKFVPVKPA 
ENKFVPVKEA 
ENQFIPVHPD 
ENKFVPVNPE 
ENKLKAKNES 
ENKLKAKNES 
HNKLKAKNDN 
DGKFIANNPE
DNKLKAHKDK 



5815 
KSHCSIGLGL 
KSHCSIGIGV 
KSHCVIGLGL 
KSHCVIGLGL 
KSHLAIGLAV 
KSHLAIGLAV 
KSHLAIGLAV 
KSHFAIGLAV 
KSHFAIGLAL 

....|....|

5875 
PNNTSAQYIF 
PNNNSAQYVF 
SNNTSAQYLF 
PNNTNAQYLF 
INDTTRKYVF 
INDTTRKYVF 
VNDTTRKYVF 
ANDTGKKYIF 
VNSTLEQYVF 

I | 

5935 
PRVMITKGVM 
PRVLISKGVM 
PRVMISRGTL 
PRTLINKGVL 
PRVLLSKGTL 
PRVLLSKGTL 
PRVLLNKGTL 
PRTLLN-GSL 
PRTLLTKGTL 



6135 



I 
6145



I 
6035
SWSKAVFISP 
TWSKAVFISP 
RWSKAVFISP 
KWRKAVFISP 
LWHKAVFISP 
LWHKAVFISP 
SWSNAVFISP 
QWREAIFISP 
AWRKAVFISP 

6095 
ITRAKKGIFC 
ITRAKKGIFC 
ITRAKKGILC 
ITRAKVGILC 
ITRAKKGILC 
ITRAKKGILC 
ITRAKKGILC 
LTRAKRGILV 
ITRAKIGILC 

I | 

6155 



I | 

6045 
YNSQNYVASR 
YNSQNYVAAR 
YNSQNYVASR 
YNSQNYVARR 
YNSQNFAAKR 
YNSQNFAAKR 
YNSQNYVAKR 
YNAMNQRAYR 
YNSQNAVASK 

....|....|

6105 
VMCDKT-LFD 
IMSDRT-LFD 
IMCDRS-LFD 
IMCDRT-MYE 
VMSNMQ-LFE 
VMSNMQ-LFE 
VMSSMQ-LFE 
VMRQRDELYS 
IMSDRD-LYD 
6165



I 
5995
SKQCFKIFFK 
SKQCFKIFER 
SKQCFKIFCK 
SKQCFKMFVK 
SSLCFKVYYK 
SSLCFKVYYK 
SSMCFKVYYK 
SRECFKVIVN 
SAQCFKMFYK 

I | 

6055 
FLGLQIQTVD
LLGLQTQTVD 
LLGLQIQTVD 
LLGLQTQTVD 
VLGLQTQTVD 
VLGLQTQTVD 
VLGLQTQTVD 
MLGLNVQTVD 
ILGLPTQTVD 

....|....|
6115 

SLKFFEIKHA 
ALKFFEITMT 
LLKFFELKLS 
NLDFYELKDS 
ALQFTTLTLD 
ALQFTTLTVD 
SLNFSTLTLD 
ALKFTELDSE 
KLQFTSLEIP 

I | 

6175 



EMCR DLHSS -QVCGLFKNC TRTPLNLPPT HAHTFLSLSD QFKTTGDLAV QIGS-N-NVC

229E DLQSE -SSCGLFKDC ARNPIDLPPS HATTYLSLSD RFKTSGDLAV QIGN-N-NVC

PEDV DLQAN -EGCGLFKDC SRGDDLLPPS HANTFMSLAD NFKTDQYLAV QIGV-N-GPI

TGEV KI GLQAK PETCGLFKDC SKSEQYIPPA YATTYMSLSD NFKTSDGLAV NIG--T-KDV

OV43 KVPQAVETKV QCSTNLFKDC SKSYSGYHPA HAPSFIAVDD KYKATGDLAV CLGIGD-SAV

BoCoV KVPQAVETRV QCSTNLFKDC SKSYSGYHPA HAPSFIAVDD KYKATGDLAV CLGIGD-SAV

MHV KIN---NPRL QCTTNLFKDC SRSYAGYHPA HAPSFIAVDD KYKVGGDIAV CLNVAD-SAV

"t„v T- S— LQGTGLFKIC NKEFSGVHPA YAVTTKAIAA TYKVNDEIAA LVNVEAGSEI 

SARS COV RRN-VATLQA ENVTGLFKDC SKIITGLHPT QAPTHLSVDI KFKTEG-LCV DIPGIP-KDM 

10 i i i i — i — i — i — i — i — i 1 

6185 6195 6205 6215 6225 6235 

EMCR TYEHVISFMG FRFDISIPGS HSLFCTRDFA IRNVRGWLGM DVESAHVCGD NIGTNVPLQV

229E TYEHVISYMG FRFDVSMPGS HSLFCTRDFA MRHVRGWLGM DVEGAHVTGD NVGTNVPLQV

PEDV KYEHVISFMG FRFDINIPNH HTLFCTRDFA MRNVRGWLGF DVEGAHWGS NVGTNVPLQL

TGEV KYANVISYMG FRFEANIPGY HTLFCTRDFA MRNVRAWLGF DVEGAHVCGD NVGTNVPLQL

OV43 TYSRLISLMG FKLDVTLDGY CKLFITKEEA VKRVRAWVGF DAEGAHATRD SIGTNFPLQL

BOCOV TYSRLISLMG FKLDVTLDGY CKLFITKEEA VKRVRAWVGF DAEGAHATRD SIGTNFPLQL

MHV TYSRLISLMG FKLDLTLDGY CKLFITRDEA IRRVRAWVGF DAEGAHATRD SIGTNFPLQL

AIBV TYKHLISLLG FKMSVNVEGC HNMFITRDEA IRNVRGWVGF DVEATHACGT NIGTNLPFQV

SARS COV TYRRLISMMG FKMNYQVNGY PNMFITREEA IRHVRAWIGF DVEGCHATRD AVGTNLPLQL 

6245 6255 6265 6275 6285 6295
EMCR GFSNGVNFW QTEGCVSTNF GDVIKPVCAK SPPGEQFRHL VPFLRKGQPW LIVRRRIVQM

229E GFSNGVDFVA QPEGCVLTNT GSWKPVRAR APPGEQFTHI VPLLRKGQPW SVLRKRIVQM

PEDV SGVDFVV RPEGCWTES GDYIKPVRAR APPGEQFAHL LPLLKRGQPW DVVRKRIVQM

TGEV GFSNGVDFVV QTEGCVITEK GNSIEWKAR APPGEQFAHL IPLMRKGQPW HIVRRRIVQM

OV43 GFSTGIDFW EMGLFADRD GYSFKKAVAK APPGEQFKHL IPLMTRGHRW DWRPRIVQM

BoCoV GFSTGIDFVV EATGLFADRD GYSFKKAVAK APPGEQFKHL IPLMTRGQRW DWRPRIVQM

MHV GFSTGIDFVV EATGMFAERD GYVFKKAVAR APPGEQFKHL VPLMSRGQKW DWRIRIVQM

AIBV GFSTGADFW TPEGLVDTSI GNNFEPVNSK APPGEQFNHL RVLFKSAKPW HVIRPRIVQM

SARS COV GFSTGVNLVA VPTGYVDTEN NTEFTRVNAK PPPGDQFKHL IPLMYKGLPW NWRIKIVQM 

•ac i i ...I.. ..I ....I 1 I 1 1 — 

6305 6315 6325 6335 6345 6355

EMCR ISDYLSNLSD ILVFVLWAGS LELTTMRYFV KIGP-IKYCY CGNSATCYNS VSNEYCCFKH 

229E IADFLAGSSD VLVFVLWAGG LELTTMRYFV KIGA-VKHCQ CGTVATCYNS VSNDYCCFKH

PEDV CSDYLANLSD ILIFVLWAGG LELTTMRYFV KIGP-SKSCD CGKVATCYNS ALHTYCCFKH 

TGEV VCDYFDGLSD ILIFVLWAGG LELTTMRYFV KIGR-PQKCE CGKSATCYSS SQSVYACFKH

OV43 FADHLIDLSD CVVLVTWAAN FELTCLRYFA KVGREISCNV CTKRATVYNS RTGYYGCWRH

BoCoV FADHLIDLSD CVVLVTWAAN FELTCLRYFA KVGREISCNV STKRATAYNS RTGYYGCWRH
MHV LSDHLVDLAD SWLVTWAAS FELTCLRYFA KVGKEVVCSV CNKRATCFNS RTGYYGCWRH
AIBV LADNLCNVSD CWFVTWCHG LELTTLRYFV KIGK-EQVCS CGSRATTFNS HTQAYACWKH
SARS CoV LSDTLKGLSD RWFVLWAHG FELTSMKYFV KIGPERTCCL CDKRATCFST SSDTYACWNH

6365 6375 6385 6395 6405 6415
EMCR ALGCDYVYNP YAFDIQQWGY VGSLSQNHHT FCNIHRNEHD ASGDAVMTRC LAVHDCFVKN 

229E ALGCDYVYNP YVIDIQQWGY VGSLSTNHHA ICNVHRNEHV ASGDAIMTRC LAVYDCFVKN

PEDV ALGCDYLYNP YCIDIQQWGY KGSLSLNHHE HCNVHRNEHV ASGDAIMTRC LAIHDCFVKN

TGEV A^CdSp YCIDIQQWGY TGSLSMNHHE VCNIHRNEHV ASGDAIMTRC I*AIHDCFVKR 

OV43 SVTCDYLYNP LIVDIQQWGY IGSLSSNHDL YCSVHKGAHV ASSDAIMTRC LAVYDCFCNN
BOCOV sTOdS LIVDIQQWGY IGSLSSNHDL YCSVHKGAHV ASSDAIMTRC LAVYDCFCNN 

c c S^ V ° V SYSCdSyNP LIVDIQQWGY TGSLTSNHDL ICSVHKGAHV ASSDAIMTRC LAVHDCFCKS 

AIBV CLGFDFVYNP LLVDIQQWGY SGNLQFNHDL HCNVHGHAHV ASVDAIMTRC LAINNAFCQD

SARS COV SVGFDYVYNP FMIDVQQWGF TGNLQSNHDQ HCQVHGNAHV ASGDAIMTRC LAVHBCFVKR 

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

6425 6435 6445 6455 6465 6475

EMCR VDWTVTYPFI ANEKFINGCG RNVQGHVVRA ALKLYKPSVI HDIGNPKGVR CA-VTDAKWY

229E VDWSITYPMI ANENAINKGG RTVQSHIMRA AIKLYNPKAI HDIGNPKGIR CA-VTDAKWY

PEDV VDWSITYPFI GNEAVINKSG RIVQSHTMRS VLKLYNPKAI YDIGNPKGIR C A— VTDAKWF 

TGEV VDWSIVYPFI DNEEKINKAG RIVQSHVMKA ALKIFNPAAI HDVGNPKGIR CA-TTPIPWF 

OV43 INWNVEYPII SNELSINTSC RVLQRVILKA AMLCNRYTLC YDIGNPRAIA CV--KDFDFK

BOCOV INWNVEYPII SNELSINTSC RVLQRVMLKA AMLCNRYTLC YDIGNPKAIA CV--KDFDFK 

MHV VNWSLEYPII SNEVSVNTSC RLLQRVMFRA AMLCNRYDVC YDIGNPKGLA CV— KGYDFK 

AIBV VNWDLTYPHI ANEDEVNSSC RYLQRMYLNA CVDALKVNVV YDIGNPKGIK CVRRGDVNFR 

SARS COV VDWSVEYPII GDELRVNSAC RKVQHMVVKS ALLADKFPVL HDIGNPKAIK CVPQAEVEWK 

70 . I . ... I .... | 1 I ....I. -.-I 

6495 6505 6515 6525 6535 

— VKLLDYD YATHG — QLD GLCLFWNCNV DMYPEFSIVC RFDTRTRSVF 
t Ll ^ L ,__, — VKTLEYD YMTHG — QMD GLCLFWNCNV DMYPEFSIVC RFDTRTRSTL 

7 5 p E DV CFDKNPTNSN VKTLEYD YITHG — QFD GLCLFWNCNV DMYPEFSWC R™TRCRSPL 

/D VRCLDYD YMVHG — QMN GLMLFWNCNV DMYPEFSIVC RFDTRTRSKL 

VKTLLYS FEAHKDSFKD GLCMFWNCNV DKYPPNAWC RFDTRVLNNL 

VKTLLYF FEAHKDSFKD GLCMFWNCNV DKYPPNAWC RFDTRVLNNL 

L . VKQFVYK YEAHKDQFLD GLCMFWNCNV DKYPANAWC RFDTRVLNKL 

80 AIBV FYDKNPIVRN VKQFEYD YNQHKDKFAD GLCMFWNCNV DCYPDNSLVC RYDTRNLSVF 



BOCOV 
MHV 
AIBV 
SARS COV



SARS COV 


VDWSVEYPII 




6485 


EMCR 


CYDKQPVNSN 


229E


CYDKNPINSN 


PEDV 


CFDKNPTNSN 


TGEV 


CYDRDPINNN 


OV43 


FYDAQPIVKS 


BoCoV 


FYDAQPIVKS 


MHV 


FYDASPVVKS 


AIBV 


FYDKNPIVRN 






SARS COV 



EMCR 

229E

PEDV 

TGEV 

OV43 

BOCOV 

MHV 

AIBV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OV43 

BOCOV 

MHV 

AIBV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OV43 

BoCoV 

MHV 

AIBV 

SARS CoV 



EMCR

229E

PEDV 

TGEV 

OV43 

BoCoV 

MHV 

AIBV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OV43 

BoCoV 

MHV 

AIBV 

SARS CoV 



EMCR 

229E 

PEDV 

TGEV 

OV43 

BoCoV 

MHV 

AIBV 

SARS COV 



EMCR 
229E 
PEDV 
TGEV 



FYDAQPCSDK AYKIEELFYS YATHHDKFTD GVCLFWNCNV DRYPANAIVC RFDTRVLSNL 

....I. ...I ....|. ...| ....|. ...| ....|.. ,.| ........ , , 

6545 6555 6565 6575 6585 6595

NLEGVNGGSL YVNKHAFHTP AYDKRAFVKL KPMPFFYFDD SDCDWQ EQVNYVPLR

NLEGVNGGSL YVNNHAFHTP AYDKRAMAKL KPAPFFYYDD GSCEWH DQVNYVPLR

NLEGCNGGSL YVNNHAFHTP AFDKRAFAKL KPMPFFFYDD TECDKLQ DSINYVPLR

SLEGCNGGAL YVNNHAFHTP AYDRRAFAKL KPMPFFYYDD SNCELVD GQPNYVPLK

NLPGCNGGSL YVNKHAFHTK PFARAAFEHL KPMPFFYYSD TPCVYMDGMD AKQVDYVPLK
NLPGCNGGSL YVNKHAFHTK PFSRAAFEHL KPMPFFYYSD TPCVYMDGMD AKQVDYVPLK
NLPGCNGGSL YVNKHAFHTS PFTRAAFENL KPMPFFYYSD TPCVYMEGME SKQVDYVPLR
NLPGCNGGSL YVNKHAFYTP KFDRISFRNL KAMPFFFYDS SPCBTIQVDG -VAQDLVSLA 
NLPGCDGGSL YVNKHAFHTP AFDKSAFTNL KQLPFFYYSD SPCESHGKQV VSDIDYVPLK 

.... I .... I .... i .... i . ... i .... i . ... i .... i . ... i .... i i i 

6605 6615 6625 6635 6645 6655

ASSCVTRCNI GGAVCSKHAN LYQKYVEAYN TFTQAGFNIW VPHSFDVYNL WOIFIET-Nrr 
ATNCITKCNI GGAVCSKHAN LYRAYVESYN IFTQAGFNIW VPT?FDCYNL HOTFOOT 
ASNCITKCNV GGAVCS KHCA MYHSYVNAYN TFTSAGFTIW VPTSFDTYNL WOTFSN--m 
SNVCITKCNI GGAVCKKHAA LYRAYVEDYN IFMQAGFTIW CPQNFDTYML "SsMT 
SATCITRCNL GGAVCLKHAE EYREYLESYN TATTAGFTFW VYKTFDFYNL WN?F^! r 
SATCITRCNL GGAVCLKHAE EYREYLESYN TATTAGFTFW VYKTFDFYNL WNTFTK— — — T* 

SATCITRCNL GGAVCLKHAE DYREYLESYN TATTAGFTFW VYKTFDFYNL WNTFTR T 

TKDCITKCNI GGAVCKKHAQ MYAEFVTSYN AAVTAGFTFW VTNKLNPYnE WKSFSA~It 
SATCITRCNL GGAVCRHHAN EYRQYLDAYN MMISAGFSLW IYKQFDTYNL 5Sr™l" 

. ... I .... | . ... | .... | | | . ... | ... . I I | , , 

6665 6675 6685 6695 6705 6715

QSLENIAFNV VKKGCFTGVD GELPVAWND KVFVRYGDVD NLVFTNKTTL PTNVAFEI.FA 
QGLENIAFNV VNKGSFVGAD GELPVAISGD KVFVRDGNTD NLVFVNKTSL PTnTaSa 
QGLENIAFNV LKKGSFVGDE GELPVAWND KVLVRDGTVD TLVFTNKTSL PTNVAFELYA 
QSLENVAFNV VKKGAFTGLK GDLPTAVIAD KIMVRDGPTD KCIBTNKTSL CTNV^fIl^A 
QSLENVVYNL VKTGHYTGQA GBMPCAIIND KWAKIDKED VVIFINNTTY SSS 
QSLENWYNL VKTGHYTGQA GEMPCAIIND KWAKIDKED VVIFINNTTY pZ»S 
QSLENWYNL VNAGHFDGRA GELPCAVIGE KVIAKIQNED WVFKNNTPF PTOVATOLFA 
QSIDNIAYNM YKGGHYDAIA GEMPTVITGD KVFVIDQGVE KAVFVNQTTL PTSVAFELYA 
QSLENVAYNV VNKGHFDGHA GEAPVSIINN AVYTKVDGID VEIFENKTTL PVNVAfIlWA 

. ... ] . . . . | . ... | .... | . ... | .... | . ... I .... I . | I I , 

6725 6735 6745 6755 6765 6775

KRKMGLTPPL SILKNLGWA TYKFVLWDYE AERPFTSYTK SVCKYTDFN _-__pnv 

KRKVGLTPPL SILKNLGWA TYKFVLWDYE AERPLTSFTK SVCGYTDFA EOV 

KRKVGLTPPI TILRNLGWC TSKCVIWDYE AERPLTTFTK DVCKYTDFE Inv 

KRKLGLTPPL TILRNLGWA TYKFVLWDYE AERPFSNFTK QVCSYTDLD SEV 

KRSVRHHPEL KLFRNLNIDV CWKHVIWDYA RESIFCSNTY GVCMYTDLK FIDKT 

KRSIRHHPEL KLFRNLNIDV CWKHVIWDYA RESIFCSNTY GVCMYTDLK TrnKT 

KRSIRPHPEL KLFRNLNIDV CWSHVLWDYA KDSVFCSSTY KVCKYTDLQ IciEs? 

KRNIRTLPNN RI LKGLGVDV TNGFVIWDYA NQTPLYRNTV KVCAYTDIE -PNr^ 

KRNIKPVPEI KILNNLGVDI AANTVIWDYK REAPAHVSTI GVCTMTDIAK KPTESACSSL 

....|....| . ... I .... | ....| .| ....|....| I I , . 

6785 6795 6805 6815 6825 6835

CVCFDNSIQG SYERFTLTTN AVLFSTWIK N LTPIK LNFGMLNGMP VSsi^nv™ 

CTCYDNSIQG SYERFTLSTN AVLFSATAVK TGGKSLPAIK LNTChSa lATVKSEnS 

SLERFSMTQN AVLMSLTAVK K LTGIK LTySgvP VN— — — THED— 

VTCFDNS I AG SFERFTTTRD AVLISNNAVK G LSAIK LQYGLLNDLP VS T^T 

NVLFDGRDNG ALEAFKRSNN GVYI STTKVK S LS MIRGPPRAEL nIvvvS^ 

NVLFDGRDNG ALEAFKRSNN GVYISTTKVK S LS MIRGPpS NGV^S^rn 

NVLFDGRDNG ALEAFKKCRD GVYINTTKIK S LS MI KGPQRADL SSrS 

WLYDDR-YG DYQSFLAADN AVLVSTQCYK R YS YVEIPSNitS nM^f^S 

TVLFDGRVEG QVDLFRNARN GVLITEGSVK G LT pSaQASV NgSeH 

6845 6855 6865 6875 6885 6895

EKLVNWYTYV RKNGQFQDHY DG

IKNINWFVYV RKDGKPVDHY DG ^ 

• -KPFTWYIYT RKNGKFEDYP"DG--~ — -—-Z™^- " ' FY-TQ- 

-KPVTWYIYV RKNGEYVEQI DS " "_~~~" YFT Q 

-TDCVFYFAV RKEGQDVIFS QFDSLGVSSN QSPQGNLGSN GKPGNVGGND ALS^^^n 
RKEGQDVIFS QFDSIiRVSSN QSPQGNLGSN -EPGNvS ALATSTI FTO 

^ZESZ SSSRS g?" -RVGDLSGNE SSSSg 

-vktqfnyfk kvdg-^iiq qlp : 

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

6905 6915 6925 6935 6945 6955

GRNLSDFTPR SDMBYDFLNM DMGVFINKYG LEDFNFEHW YGDVSKTTLG GtHT f Tcnirr, 
GRNLQDFLPR STMBBDFLNM DIGVFIQKYG LEDFNFEHW YGDVSK™ S ™ 
™S FSPR SDMEKDFLSM DMGLFINKYG LEDYGFE^W YGDVSkS GLHLwS 
GRTFETFKPR STMEEDFLSM DTTLFIQKYG LEDYGFEHW FGDVSKTTIG GMHLLISQVR

OC43 SRVISSFTCR TDMEKDFIAL DQDVFIQKYG LEDYAFEHIV YGNFNQKIIG GLHLLIGLYR

BoCoV SRVISSFTCR TDMEKDFIAL DQDVFIQKYG LEDYAFEHIV YGNFNQKIIG GLHLLIGLYR

MHV SRFLSSFAPR SEMEKDFMDL DEDVFIAKYS LQDYAFEHW YGSFNQKIIG GLHLLIGLAR

AIBV GRSYETFEPR SDIERDFLAM SEESFVERYG -KDLGLQHIL YGEVDKPQLG GLHTVIGMYR

SARS COV SRDLEDFKPR SQMETDFLEL AMDEFIQRYK LEGYAFEHIV YGDFSHGQIiG GLHLMIGLAK

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

6965 6975 6985 6995 7005 7015 

EMCR LSKMGVLKAD DFVTASDTTL RCCTVTYLNE LSSKVVCTYM DLLLDDFVTI LK SLDLG 

229E LSKMGILKAE EFVAASDITL KCCTVTYLND PSSKTVCTYM DLLLDDFVSV LK SLDLT

PEDV LACMGVLKID EFVSSNDSTL KSCTVTYADN PSSKMVCTYM DLLLDDFVSI LK SLDLS

TGEV LAKMGLFSVQ EFMNNSDSTL KSCCITYADD PSSKNVCTYM DILLDDFVTI IK SLDLN

OC43 RQQTSNLVVQ EFVS-YDSSI HSYFITDEKS GGSKSVCTVI DILLDDFVAL VK SLNLN

BoCoV RQQTSNLVIQ EFVS-YDSSI HSYFITDEKS GGSKSVCTVI DILLDDFVAL VK SLNLN

MHV RQQKSNLVIQ EFVP-YDSSI HSYFITDENS GSSKSVCTVI DLLLDDFVDI VK SLNLN

AIBV LLRANKLNAK SVTN-SDSDV MQNYFVLSDN GSYKQVCTVV DLLLDDFLEL LRNILKEYGT

SARS COV RSQDSPLKLB DFIP-MDSTV KNYFITDAQT GSSKCVCSVI DLLLDDFVEI IK SQDLS

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

7025 7035 7045 7055 7065 7075

EMCR VISKVHEVII DNKPYRWMLW CKDNHLSTFY PQLQS-AEWK CGYAMPQIYK LQRMCLEPCN

229E WSKVHEVII DNKPWRWMLW CKDNAVATFY PQLQS-AEWK CGYSMPGIYK TQRMCLEPCN

PEDV VVSKVHEVMV DCKMWRWMLW CKDHKLQTFY PQLQA-SEWK CGYSMPSIYK IQRMCLEPCN

TGEV VVSKVVDVIV DCKAWRWMLW CENSHIKTFY PQLQS-AEWN PGYSMPTLYK IQRMCLERCN

OC43 CVSKVVNVNV DFKDFQFMLW CNDEKVMTFY PRLQAASDWK PGYSMPVLYK YLNSPMERVS

BoCoV CVSKVVNVNV DFKDFQFMLW CNDEKVMTFY PRLQAASDWK PGYSMPVLYK YLNSPMERVS

MHV CVSKVVNVNV DFKDFQFMLW CNEEKVMTFY PRLQAAADWK PGYVMPVLYK YLESPLERVN

AIBV NKSKWTVSI DYHSINFMTW FEDGSIKTCY PQLQS — AWT CGYNMPELYK VQNCVMEPCN

SARS COV VISKVVKVTI DYAEISFMLW CKDGHVETFY PKLQASQAWQ PGVAMPNLYK MQRMLLEKCD

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

7085 7095 7105 7115 7125 7135 

EMCR LYNYGAGIKL PSGIMLNWK YTQLCQYLNS TTMCVPHNMR VLHYGAGSDK GVAPGTTVLK

229E LYNYGAGLKL PSGIMFNVVK YTQLCQYFNS TTLCVPHNMR VLHLGAGSDY GVAPGTAVLR

PEDV LYNYGAGVKL PDGIMFNVVK YTQLCQYLNS TTMCVPHHMR VLHLGAGSDK GVAPGTAVLR

TGEV LYNYGAQVKL PDGITTNWK YTQLCQYLNT TTLCVPHKMR VLHLGAAGAS GVAPGSTVLR

OC43 LWNYGKPVTL PTGCMMNVAK YTQLCQYLNT TTLAVPVNMR VLHLGAGSEK GVAPGSAVLR

BoCoV LWNYGKPVTL PTGCMMNVAK YTQLCQYLNT TTLAVPVNTR VLHLGAGSEK GVAPGSAVLR

MHV LWNYGKPITL PTGCLMNVAK YTQLCQYLNT TTLAVPANMR VLHLGAGSDK DVAPGSAVLR

AIBV IPNYGVGITL PSGILMNVAK YTQLCQYLSK TTICVPHNMR VMHFGAGSDK GVAPGSTVLK

SARS COV LQNYGENAVI PKGIMMNVAK YTQLCQYLNT LTLAVPYNM* VIHFGAGSDK GVAPGTAVLR

7205 7215 7225 7235 7245 7255

EMCR RIKFCDGE NVSKDGFFTY LNGVIREKLA IGGSVAIKIT EYSWNKYLYE LIQRFAFWTL

229E RTKAIDGE NVSKEGFFTY INGFICEKLA IGGSIAIKVT EYSWNKKLYE LVQRFSFWTM

PEDV --KIKSCDGE NVSKEGFFPY INGVITEKLA LGGTVAIKVT EFSWNKKLYE LIQKFEYWTM

TGEV — STKSIDGE NTSKDGFFTY INGFIKEKLS LGGSVAIKIT EFSWNKDLYE LIQRFEYWTV

OC43 ITKNIGEY NVSKDGFFTY ICHMIRDKLA LGGSVAIKIT EFSWNAELYK LMGYFAFWTV

BoCoV — LLLDIGVH VVRCS YI HCHMIRDKLA LGGSVAIKIT EFSWNAELYK LMGYFAFWTV

MHV — LTKNIGEY NVSKDGFFTY LCHLIRDKLA LGGSVAIKIT EFSWNAELYS LMGKFAFWTI

AIBV SKRKHEGVIA NNGNDDVFIY LSSFLRNNLA LGGSFAVKVT ETSWHEVLYD IAQDCAWWTM

SARS COV — RTKHVTKE NDSKEGFFTY LCGFIKQKLA LGGSIAVKIT EHSWNADLYK LMGHFSWWTA

PEDV FCTSVNTSSS EAFLIGVHYL GDFASGAVID GNTMHANYIF WRNSTIMTMS YNSVLDLSKF

TGEV FCTSVNTSSS EGFLIGINYL GPYCDKAIVD GNIMHANYIF WRNSTIMALS HNSVLDTPKF 

OC43 FCTNANASSS EGFLIGINYL CKPKV — EID GNVMHANYLF WRNSTVWNGG AYSLFDMAKF

BoCoV FCTNANASSS EGFLIGINYL GKPKV — EID GNVMHAI ICF G

MHV FCTNVNASSS EGFLIGINWL NRTRT— EID GKTMHANYLF WRNSTMWNGG AYSLFDMSKF

AIBV FCTAVNASSS EAFLIGVNYL GASEK-VKVS GKTLHANYIF WRNCNYLQTS AYSIFDVAKF

SARS COV FVTNVNASSS EAFLIGANYL GKPKE — QID GYTMHANYIF WRNTNPIQLS SYSLFDMSKF

EMCR ECKHKATVVV TLKDSDVNDM

229E NCKHKATVVV QLKDSDINEM

PEDV NCKHKATVVV NLKDSSISDV

TGEV KCRCNNALIV NLKEKELNEM

OC43 PLKLAGTAVI NLRADQINDM

BoCoV EIPQFGTGVL IACLIWLNSR

MHV PLKVAGTAVV SLKPDQINDL 

AIBV DLRLKATPVV NLKTEQKTDL 

SARS CoV PLKLRGTAVM SLKENQINDM

VLSLIKSGRL LLRNSGRFGG FSNHLVSTK-
VLSLVRSGKL LVRGNGKCLS FSNHLVSTK- 
VLGLLKNGKL LVRNNDAICG FSNHLVNVNK 
VIGLLRKGKL LIRNNGKLLN FGNHFVNTP- 
VYSLLEKGKL LIRDTNKEVF VGDSLVNVI- 

LSWLVMP 

VLSLIEKGKL LVRDTRKEVF VGDSLVNVK- 
VFNLIKCGKL LVRDVGNTSF TSDSFVCTM— 
IYSLLEKGRL IIRENNRVW SSDILVNN—

MRSLIYFWLL LPVLP T LSLPQDV

MKKLFVV LVVMP L IYGDNFP 

MIVLTLC LFLFL-YSSV SCTSNND 

MIVLVTC LLLLCSYHTV LSTTNNE 

MKKLFVV LVVMP L IYG 

MFLIL LISLPTAFAV IGDLKCTSDT SYINDKDTGP 

MFLIL LISLPMAFAV IGDLKCT — T VSINDVDTGA 

MLFVF LTLLPSSLGY IGDFRCIQ-L VNTDTSNASA 

MLFVF LTLLPSCLGY IGDFRCIN-L VNTRISNARA 

MFFIL LISLPSAFAV IGDLKCT — T SLINDVDTGV 



RCQSTTNF RRFFS 

SKLTNRTIGN QWNLIETFLL 
VQVNVTQLPG NENIIKDFLF 
IQVNVTQLAG NENLIRDFLF 



PPISTDTVDV 
PSISTDIVDV 
PSVSTEVVDV 
PSVSTEWDV 
PSISSEWDV 



TNGLGTYYVL 
TNGLGTYYVL 
SKGIGTYYVL 
SKGLGTYYVL 
TNGLGTFYVL
-CTTFDDVQA PNYTQHTSSM
f^IVTGLLP VHWICAN QSTSSYPANG FFYIDVG-KH RSAFALHSGY



KFN — VQAPA VVVLGGYLPS MNSSSWYCGT 

NYSSRLPPNS DWLGDYFPT — VQPWFNCI 

QN FKEEG SLWGGYYP TEVWYNCS 

SN FKEEG SWVGGYYP- — TEVWYNCS 

DR VYLNT TLFLNGYYPT SGSTYRNMAL 

DR VYLNT TLLLNGYYPT SGSTYRNMAL 

DR VYLNA TLLLTGYYPV DGSMYRNMAL 

DR VYLNA TLLLTGYYPV DGSMYRNMAL 

DR VYLNT TLLLNGYYPI SGATFRNMAL 



GIETASGVHG 
RNDSNDLYVT 
TTQQTTAYKY 
RTARTTAFQY 

KGSVLLSRLW 
KGTLLLSRLW 
TGINTISLNW 
MGTNTLSLNW 
KGTRLLSTLW 



IFLSYIDSGQ 
LENLKALYWD 
FSNIHAFYFD 
FNNIHAFYFV 

FKPPFLSDFI 
FKPPFLSDFI 
YKPPFLSEFN 
FEPPFLSEFN 
FKPPFLSPFN
LNAPVTLKIC KFGN
TSFDFLS



FDPSGYQLYL 
— HRQRLNVW 
ARGKPLLVHV 
ARGKPLLFHV 



HKATNG N TNAIARLRIC QFPDN KTLGPTVN 

2^^? ITV TTTRN FNSAEGAIIC ICKGSPPTTT TESSLTCNWG 

HGNPVSIIVY ISAYRDDVQF RPLLKHGLLC ITKN — DTVD YNSFTINOWR 
HGEPVSVIX- -SAYRDDVQQ RPLLKHGLVC ITKN — RHIN YEQF^SNQWN 



KVI KDRVMYS 
KVIKKGVMYS 
KASLPKDSIS 
KASLPIGSAS 
RFSKDGVIYS 



SNVVRGWVFG STMNNKSQS- 



STF VNTSYSVVVQ PRTINSTQDG YNKLQGLLEV 

vwrln STF VNTSYSVVV Q PHTTN L DNKLQGLLEI 

YFPTIIIG SNF VTTSYTWLE PYN -rTTM* 

YFPTIIIG SNF VNTSYTVVLE PYN gttma 

!f s ! sivvE phtsl — 1 ngnlqgllqi 

VII INNSTNWIR ACN FEL CDNPFFAVSK
DVTTGRNCLF
SECR-LNHKF 
DICLGDDRKI 
STCTGADRKI
SVCQYNMCEY
SVCQYTMCEY
SICQYTICQL
PQTICHPNLG
PYTDCKPNTG
KLTKLSVKCY

ALLHIAG 

DWSRVATRCY 
QWSGTVTFGD 
NWFNNVTLLY 
NWFNNVTLLY 

— NHRKELWH LDTGVVSCLY KRNFTYDVNA DYLYFHFYQ- 
NKRVELWH WDTGVVSCLY KRNFTYDVNA DYLYFHFYO- 
G-NKLIGFWH TELKSPVCIL KRNFTFNVNA EWLYFHFYQ- 



GKDIVVGITW 
CGNMLYGLQW 
-GTKLFGLEW 
-GTKIYGLEW 



DNDRVTVF-A 
FADEVVAYLH 
NDDYVTAYIS 
NDDFVTAYIS
TRTFYVPAAY
— MFVLLVAY 
DKIYHFYLKN 
GASYRISFEN 
DESHRLNINN 
GRSYHLNINT
NRRSCAM QYVYTP

MRATTLEVAG TLVDLWWFNP 
SRTSTATWQH S — AAYVYQG 
SRSSTATWEY S — AAYAYQG
EMCR S
229E S 
PEDV 
TGEV 
CaCoV
-T-ITVNVTT i

CQT 

TY-YMLNVTS 
VYDVSYYRVN 
VSNFTYYKLN 
VSNFTYYKLN 



EGG 

EGG 

QGG 

QGG 

EGG
YQP



-TFYAYFTDT 
-TFYAYFTDT 
-TFYAYYADV 
-TFYAYYADV 
-TFYAYFTDT 

YVYYYQS 

IDVVRDLPSG
LNGRIVNYTV
TNGLNTSYSV 
AGEDGIYYEP 
NKNGTTWSN 
KTAGLKSYEL 
NTNGLKTYEL 

TSVVSN 

GWTKFLFNV 
GWTKFLFNV 
SSATTFLFSM 
SSATTFLFSS 
GFVTKFLFKL 
AFRPPSGWHL 
FNTLKPIFKL
CDD CNGY

CNG CVGY 

CTAN — CTGY 
CTD — QCASY 
CEDYEYCTGY 
CEDYEHCTGY 
CTD — QCASY 

YLG MALS 

YLG TVLS 

YIG DVXT 

YIG AVLT 

YLG TVLS 

QGG AYAV 

PLG-INITNF 



....|....|

295 
TDNIFSVQQD 
SENVFAVESG 
AANVFATDSN 
VANVFTTQPG 
ATNVFAPTSG 
ATNVFAPTSG 
VANVFTILPG 
HYYVMPLTCN 
HYYVLPLTCS 
QYFVLPYMCT 
QYFVLPYMCS 
HYYVMPLTCN 
VNISSEFNNA 
RAILTAFSPA
GRIPNGFPFN
GYIPSDFAFN 
GHIPEGFSFN 
GFIPSDFSFN 
GYIPDGFSFN 
GYIPDGFSFN 
GFIPSDFSFN 
SKVKNGFTLE 

S AMTLE 

LTTTGVFSPQ 
PTTSGVSSPQ 

S ALSLE 

G SSS 

QDIWGTSAAA 



315 
NWFL-LTNGS 
NWFL-LTNTS 
NWFL-LSNDS 
NWFL-LTNSS
NWFL-LTNSS
NWFL-LTNSS 
YWVTPLTSRQ 
YWVTPLTSKQ 
YWVTPLVKRQ
TLVDGVSRLV
SWDGWRSF 
TLLHGKVVSN 
TLVSGKLVTK 
TFVSGRFVTN
YLLAFNQDGV
QPLRLTCLWP
QPLLLNCLWS 
QPLLVNCLLA 
QPLLVNCLWP 
QPLLVNCLWP 
QPLLINCLWP 
QPLLVNCLWP 
IFNAVDCMSD 
IFNAVDCKSD 
ITSAVDCASS 
ITSAVDCASS 
LYHAVDCASD 
TAPSSGMAWS 
ITDAVDCSQN
VPGLKSSTGF
VSGLRFTTGF 
IPKIYGLGQF 
VPSFEEAAST 
VPSFGVAAQE 
VPSFGVAAQE 
VPSFEEVAST 
FMSEIKCKTQ 
FMSEIKCKTL 
YTSEIKCKTQ 
YTSEIKCKTQ 
FMSEIMCKTS 
SSQFCTAHCN 
PLAELKCSVK
VYFNATGSDV
VYFNGTG-RG 
FSFNHTM-DG 
FCFEGAG-FD 
FCFEGAQ-FS 
FCFEGAQ-FS 
FCFEGAD-FD 
SIAPPTG-VY 
SIAPSTG-VY 
SMNPNTG-VY 
SMNPNTG-VY 
SITPPTG-VY 
FSDTTVFVTH 
SFEIDKG-IY
NCNGYQHNSV
DCKGFSSDVL 
VCNGAAVDRA 
QCNGAVLNNT 
QCNGVSLNNT 
QCNGVSLNNT 
QCNGAVLNNT 
ELNGYTVQPI 
ELNGYTVQPI 
DLSGYTVQPV 
DLSGYTVQPV 
ELNGYTVQPV 
CYKHGGCPLT 
QTSNFRWPS
ADVMRYNLNL
SDVIRYNLNF 
PEALRFNIND 
VDVIRFNLNF 
VDVIRFNLNF 
VDVIRFNLNF 
VDVIRFNLNF 
ADVYRRKLNL 
ADVYRRIPNL 
GLVYRRVRNL 
GLVYRRVRNL 
ATVYRRIPDL 
GMLQQNLIRV 
GDWRFPNIT
SANSVDNLKS

EEN — LRR 

TSV ILAE 

TTNVQSGKGA 
TTDVQSGMGA 
TADVQSGMGA 
TTNVQSGKGA 
PNCNIEAWLN 
PDCNIEAWLN 
PDCKIEEWLT 
PDCKIEEWLA 
PNCDIEAWLN 
SAMKNGQLFY 
NLCPFGEVFN
GVIVFKTLQY
GTILFKTSYG
TVFSLNTTGG
DKSVPSPLNW
TTIPFGPSSQ
AHIPFGTVLG 
FAIPLGATEV 
GEIPFGVTDG 
GEIPFGVTDG
GEIPFGVTNG
ADSFTCNNID
GDLVYTSNET
FSTFKCYGVS
PYYCFINSTI
NFYCFVNTTI 
PYYCFLKVDT 

PRYCYV H 

PRYCYV L 

PRYCYV L 

PRYCYV L 

AAKIYG — MC 
AAKIYG — MC 
ASKVYG — MC 
ASKVYG — MC 
ASRLYG — MC 
IDVTSAG — V 
ATKLND — LC
NTTHVSTFVG
GNETTSAFVG 
YNSTVYKFLA 
YNGTALKYLG 
YNGTALKYLG 
YNGTALKYLG 
YNGTALKYLG 
FSSITIDKFA 
FSSITIDKFA 
FGSISIDKFA 
FGSISIDKFA 
FGSITIDKFA 
YFKAGGPITY 
FSNVYADSFV
LGKSGLLQSF
GTDDDVSGFW TIASTNFVDA
— TGDSDVFW TIAYTSYTEA 
— TGASGAFW TIAYTSYTEA
LVNVSATNIQ
LVNVSQTSIA 
LIEVQGTSIQ 
LVQVENTAIT 
LVQVENTAIK
NLLYCDSPFE
NIIYCNSVIN 
RILYCDDPVS 
KVTYCNSHVN 
KVTYCNSHIN
CQLYYSLP—
LACQYNTG—
-TGDSDVFW
— AANVSVS 
— AANVSVS 
— KNNVTVN 
— QDNVTVI 
— AANVSVT 
— NFSDGFY 
NID 



TIAYTSYTEA 
TIAYTSYTEA 
RFNPSTWNKR 
RFNPSTWNRR 
NHNPSSWNRR 
NHNPSSWNRR 
HYNPSSWNRR 
PFTNSSLVKQ 
ATSTGNYNYK 



LVQVENTAIK 
LVQVENTAIT 
FGFIEDSVFK 
FGFTEQFVFK 

YGFND 

YGFND 

YGFNN 

KFIVYR 

YRYLR 



NVTYCNSHIN 
NVTYCNSYVN 
PRPAGVLTNH 
PQPVGVFTHH 
-VATFGTGKH 
-VATFHSGEH 
-QSFGSRGLH 
ENSVNT
NIKCSQLTAN
DVAYAEACFT
TCTLHNFIFH
LNNGFYPVSS
LQNGFYPVAS
CKPRQVNISL
CYPAGVNITL 
ANLVASDTTI 
IASTLSNITL 
IASPLSNITL 
IASTLSNITL 
IASTLSNITL 
TPDPITFKAT 
TPDPITSKST 
NPSPLTTYDL 
NPSPLTTYDP
— RNLLSHEQ
— SEVG — FV
P ET YVALPIYYQH

p VS IVSLPVYHKH 

p IS FVTLPSFNDH 

N KS VVLLPSFYTH
-KS WLLPSFYSH
-KS VVLLPSFFTY 



N KS VVLLPSFLTH 

G KNNG IGTCPAGTNY 

GIDAGYKNSG IGTCPAGTNY 
G K-PN FANCPTGTSN
G VQNIQTYQTK

D ISNVPFSPDG 



585 
TDINFTATA- 
TFIVLYVDFK 
SFVNITVSA- 
TIVNITIGLG 
TSVNITIDLG 
TAVNITIDLG 
TIVNITIGLG 

LTCDN 

LTCHNAA 

RECTVMPLAN 
RECNVQASG- 
RKCFAAVTK- 
TAQSGYYNFN 
KPCTP
— SFGGSCYV
PQSGGGKCFN 
— AFGG-LSS 
-MKRSGYGQP 
-MKRS-VTVT 
-MKLSGYGQP 
-MKRSGYGQP 

LC 

QCDCLC 

-NQFKCDCTC 
-FKSKCDCTC 
— ATKCTCWC 
FSF
N GNTSV

ANFNETKGPL 

N GFSSF 

PMQDHNTDVY 
PMQDNNIDVY 
PMQDNNTDVY 
PMQDNNNDVY 
GTYKCPQTKS 
GPYKCPQTKY 
— R-CLQARS 
— R-CLQARS' 
NAWTCPQSKV 
FMYGSYHPSC 
P— AL
GTCPFSFSKL
GNCPFSFGKV 
SNCPFTLQSV 
GTCPFSFDKL 
GTCPFSFDKL 
GTCPFSFDKL 
GTCPFSFDKL 
GWSADSCLQG 
GWSVDSCLOG 
GWAKDSCLAN 
GWAMDSCLSN 
GWSSETCLQN 
GGCKQSVFKG 
GYQPYRVVVL 

....|....|



675 
NNFQKFKTIC 
NNFVKFGSVC 
NDYLSFSKFC 
NNYLTFNKFC 
NNYLTFNKFC 
NNYLTFNKFC 
NNYLTFNKFC 
DKCNIFANFI 
DRCNIFANFI 
GRCHIFSNLM 
ARCHI FSNLM 
GRCNIFANFI 
RATCCYAYSY 
S FELLN

625
CVRTSHFSIR 
CVDTSHFTTK 
CVDTRQFTIT 
CIRSDQFSVY 
CIRSNQFSVY 
CIRSNQFSVY 
CVRSDQFSVY 
LVGIGEHCSG 
LVGIGEHCSG 
MLGVGDHCEG 
MLGVGDHC EG 
SIQPGQHCPG 
KFRLETINNG 
NCYWPLNDY- 

....|....| 

685 
FSTVEVPGSC 
FSLKDIPGGC 
VSTSLLAGAC 
LSLSPVGANC 
LSLNPVGANC 
LSLSPVGANC 
LSLSPVGANC 
LHDVNSGLTC 
FHDVNSGTTC 
LNGINSGTTC 
LNGINSGTTC 
LNDVNSGTTC 
GGPSLCKGVY 
APATVCGPKL

635
YIYNRVKSGS 
YVAVYANVG- 
LFYNVTNSYG 
VHSTCKSALW 
VHSTCKSSLW 
VHSTCKSSLW 
VHSTCKSVLW 
LAVKSDYCGG 
LAIKSDYCGG 
LGVLEDKCGG 
LGILEDKCGG 
LGLVEDDCSG 
LWFNSLSVS- 
G

695
NFPLEATW — 
AMPIVANW — 
TIDLFGYP — 

KFDVAAR 

KLDVAAR 

KFDVAAR 

KFDVAAR 

STDLQKANTD 
STDLQKSNTD 
SMDLQLPNTE 
STDFQLPNTE 
STDLQQGNTI 

SGELDHN 

STDLIKN 



645
DNIFKRNCTD
DNNFNSACTD 
DNIFNQDCTD 
DNVFKRNCTD 

N 

N 

S N 

S N 

N

705
HYTSYTIVGA 
AYSKYXTIGS 
AFGSGVBCLTS 
TRTNEQWRS 
TRTNEQVFGS 
TRTNEQVVRS 
TRTNDQVVRS 
IILGVCVNYD 
IILGVCVNYD 
WTGVCVKYD 
VVTGVCVKYD 
ITTDVCVNYD 

FECGLLV 

QCVNFN 



655 
DSSWHIYLKS 
— RWSASINT 

YVSKSQD 

VLDATAVIKT 
VLDATAVIKT 
VLEATAVIKT 
VLDATAVIKT 
SCTCRPQAFL 
PCTCQPQAFL 
TCNCSAHAFV 
ICNCSADAFV 
PCTCKPQAFI 

IAYGPLQ 

FYTTTGI 

....|....|

715 
LYVTWSEGNS 
LYVSWSDGDG 
LYFQFTKGEL 
LYVIYEEGDN 
LYVIYEEGDN 
LYVIYEEGDN 
LYVIYEEGDS 
LYGILGQGIF 
LYGITGQGIF 
LYGITGQGIF 
LYGSTGQGVF 
LYGITGQGIL 
YVTKSGGSRI 
FNGLTGTG-V 



725 
ITGVPYPVSG 
ITGVPQPVEG 
ITGTPKPLEG 
IVGVPSDNSG 
IVGVPSD^SG 
IVGVPSDNSG 
IVGVPSDNSG 
VEVNATYYNS 
VEVNATYYNS 
KEVKADYYHS 
KEVKADYYNS 
IEVNATYYNS 
QTATEPPVIT 
LTPSSKRFQP 



735 
IREFSNLVLN 
VSSFMNVTLD 
ITDVSFMTLD 
VHDLSVLHLD 
LHDLSVLHLD 
LHDLSVLHLD 
LHDLSVLHLD 
WQNLLYDSNG 
WQNLLYDSNG 
WQNLLYDVNG 
WQNLLYDVNG 
WQNLLYDSSG 
QNNYNNITLN 
FQQFGRDVSD 



785 



I 



795 



I 



745 
NCTKYNIYDY 
KCTKYNIYDV 
VCTKYTIYGF 
SCTDYNIYGR 
SCTDYNIYGR 
SCTDYNIYGR 
SCTDYNIYGR 
NLYGFRDYIT 
NLYGFRDYLT 
NLIGFRDFVA 
NLNGFRDIVT 
NLYGFRDYLS 
TCVDYNIYGR 
FTDSVRDPKT 



805 



755 
VGTGIIRSSN 
SGVGVIRVSN 
KGEGtlTLTN 
TGVGIIRQTN 
TGVGIIRKTN 
TGVGIIRRTN 
TGVGIIRQTN 
NRTFMIRSCY 
NRTFMIRSCY 
NKSYTIRSCY 
NKTYLLRSCY 
NRTFLIRSCY 
TGQGFITNVT 
SEILDISPCS 



815 



765 775 

QSLAGGITYV S 

DTFLNGITYT S 

SSILAQVYYT .S— - 

RTLLSGLYYT S 

STLLSGLYYT S 

STLLSGLYYT S 

RTILSGLYYT S 

SGRVS AAFHA N 

SGRVSAAFHA N 

SGRVSAAYHQ D 

SGRVSAAYHQ D 

SGRVSAVFHA N 

DSAVSYNYLA DAGLAILDTS 
FGGVSVITPG TN A
NSGNLLGFKN

TSGNLLGFKD 

DSGQLLAFKN 

LSGDLLGFKN 

LSGDLLGFKN 

LSGDLLGFKN 

LSGDLLGFTN 

SSEPALLFRN 

SSEPALLFRN 

APEPALLYRN 

APEPALLYRN 

SSEPALMFRN 

GSIDIFWQG 

SSEVAVLYQD
VSTGNIFIVT
VTKGTIYSIT 
VTSGAVYSVT 
VSDGVIYSVT 
VSDGWYSVT 
VSDGVIYSVT 
VSDGVIYSVT 
IKCNYVFNNS 
IKCNYVFNNT 
LKCDYVFNNN 
LKCDYVFNNN 
LKCSHVFNNT 
EYGLNYYKVN 
VNCTDVSTAI 



PCNQPDQVAV 
PCNPPDQLW 
PCSFSEQAAY 
PCDVSAQAAV 
PCDVSAQAAV 
PCDVSAQAAV 
PCDVSAQAAI 
LTRQLQPINY 
LSRQLQPINY 
ISREETPLNY 
ISREETPLNY 
ILRQIQLVNY 
PCEDVNQQFV 
HADQLTPAWR 

I' | 



YQQ-SIIGAM 

YQQ-AWGAM 

VND-DIVGVI 

IDG-TIVGAI 

IDG-AIVGAM 

IDG-AIVGAM 

IDG-TIVGAI 

FDS-YLGCW 

FDS-YLGCVV 

FDS-YLGCVV 

FDS-YLGCVI 

FDS-YLGCVV 

VSGGKLVGIL 

IYS-TGNNVF 



TAVNESRYGL 
LSENFTSYGF 
SSLSNS — TF 
TSINSELLGL 
TSINSELLGL 
TSINSELLGL 
TSINSELLGL 
NAYNSTAISV 
NADNSTSSW 
NADNSTEEAV 
NADNSTEQSV 
NAYNNTASAV 
TSRNETGSQL 
QTQAGCLIGA 



QNLLQLPNFY 
SNWELPKFF 
NNTRELPGFF 
THWTTTPNFY 
THWTTTPNFY 
THWTTTPNFY 
THWTTTPNFY 
QTCDLTVGSG 
QTCDLTVGSG 
DACDLRMGSG 
DACDLRMGSG 
STCDLTVGSG 
LENQFYIKIT 
EHVDTSYECD

845

YVS NG 

YAS NG 

YHS ND 

YYSI YNY 

YYSI YNY 

YYSI YNY 

YYSI YNY 

YCVD YSK 

YCVD YST 

LCVN YST 

LCVN YSI 

YCVD YVT 

NGTRRFRRSI 
I PIG AGI 



855 865 

GNN CTTAV 

TYN CTDAV 

GSN CTEPV 

TNDRTRGTAI DSNDVDCEPV 
TNVMNRGTAI D-NDIDCEPI 
TSERTRGTAI DSNDVDCEPV 
TNDKTRGTPI GSNDVDCEPV 

NRR SRGAI 

KRR SRRAI 

SHR ARSSV 

AHR ARRSV 

ALR SRRSF 

TEN VANCPY 

CAS YHTVSL

875
MIYSNFGICA 
LTYSSFGVCA 
LVYSNIGVCK 
ITYSNIGVCK 
ITYSNIGVCK 
ITYSNIGVCK 
ITYSNIGVCK 
TTGYRFTNFE 
TTGYRFTNFE 
STGYKLTTFE 
STGYKLTTFE 
TTGYRFTNFE 
VSYGKFCIKP 
LRSTSQKSIV

885
DGSLIPVRPR 
DGSIIAVQPR 
SGSIGYV-PS 
NGAFVFIN-V 
NGALVFIN-V 
NGALVFIN-V 
NGALVFIN-V 
PFTVNSVN — 
PFTVNSVN — 
PFTVRIVN — 
PFTVSIVN — 
PFAANLVN — 
DGSIATIVPK 
AYTMSLG

895
NSSDNGISAI 
NVSYDSVSAI 
QYGQVKIAPT 
THSDGDVQPI 
THSDGDVQPI 
THSDGDVQPI 
THSDGDVQPI 

DSLEPVG 

DSLEPVG 

DSVESVD 

DSVESVG 

DSIEPVG 

QLEQFVAPLF 
ADS SI AY

905
-ITANLSIPS 
-VTANLSIPS 
-VTGNISIPT 
-STGNVTIPT 
-STGNVTIPT 
-STGNVTIPT 
-STGNVTIPT 
-GLYEIQIPS 
-GLYEIQIPS 
-GLYELQIPT 
-GLYEMQIPT 
-GLYEIQIPS 
NVTENVLIPN 
-SNNTIAIPT 



915 
NWTTSVQVEY 
NWTTSVQVEY 
NFSMSIRTEY 
NFTISVQVEY 
NFTISVQVEY 
NFTISVQVEY 
NFTISVQVEY 
EFTIGNMEEF 
EFTIGNMEEF 
NFTIASHQEF 
NFTIASHQEF 
EFTIGNLEEF 
SFNLTVTDEY 
NFSISITTEV 



925 
LQITSTPIW 
LQITSTPIW 
LQLYNTPVSV 
IQVYTTPVSI 
IQVYTTPVSI 
MQVYTTPVSI 
IQVYTTPVSI 
IQTSSPKVTI 
IQTSSPKVTI 
VQTRSPKVTI 
IQTRSPKVTI 
IQTRSPKVTI 
IQTRMDKVQI 
MPVSMAKTSV 

....|....|

935
DCATYVCNGN 
DCSTYVCNGN 
DCATYVCNGN 
DCSRYVCNGN 
DCARYVCNGN 
DCARYVCNGN 
DCSRYVCNGN 
DCAAFVCGDY 
DCSAFVCGDY 
DCAAFVCGGH. 
DCAAFVCGDY 
DCATFVCGDY 
NCLQYVCGSS 
DCNMYICGDS 

....|....|



945 
PRCKNLLKQY 
VRCVELLKQY 
SRCKQLLTQY 
PRCNKLLTQY 
PRCNKLLTQY 
PRCNKLLTQY 
PRCNKLLTQY 
AACKSQLVEY 
AACKSQLVEY 
TACRQQLVEY 
TACRQQLVDY 
AACRQQLAEY 
LDCRKLFQQY 
TECANLLLQY

955
TSACKTIEDA 
TSACKTIEDA 
TAACKTIESA 
VSACQTIEQA 
VSACQTIEQA 
VSACQTIEQA 
VSACQTIEQA 
GSFCDNINAI 
GSFCDNINAI 
GSFCDNINAI 
GSFCDNINAI 
GSFCENINAI 
GPVCDNILSV 
GSFCTQLNRA

965
LRLSAHLETN 
LRNSARLESA 
LQLSARLESV 
LAMGARLENM 
LAMGARLENM 
LAMGARLENM 
LAMGARLENM 
LTEVNELLDT 
LTEVNELLDT 
LGEVNNLIDT 
LGEVNNLIDT 
LTEVNELLDT 
VNSVGQKEDM 
LSGIAAEQDR 



975 
DVSSMLTFDS 
DVSEMLTFDK 
EVNSMLTISE 
EVDSMLFVSE 
EIDSMLFVSE 
EVDSMLFVSE 
EVDSMLFVSE 
TQLQVANSLM 
TQLQVANSLM 
MQLQVASALI 
MQLQVASALI 
TQLQVANSLM 
ELLNFYSSTK 
NTREVFAQVK 



985 
NA-FSLANVT 
KA-FTLANVS 
EA-LQLATIS 
NA-LKLASVE 
NA-LKLASVE 
NA-LKLASVE 
NA-LKLASVE 
NG-VTLSTKL 
NG-VTLSTKL 
QG-VTLSSRL 
QG-VTLSSRL 
NG-VTLSTKI 
PAGFNTPVLS 
QM-YKTPTLK 



995 

SFG D 

SFG D 

SFNG DG 

AFN SS 

AFN ST 

AFN ST 

AFN SS 

KDGVNFNVDD 
KDGVNFNVDD 
SDGIGGQIDD 
ADGISGQIDD 
KDGINFNVDD 

NVSTG E 

YFG G

1005 1015

YNLSSVLPQ 

YNLSSVIPS 

YNFTNVLGAS 

ETLDPIYKEW PNIGGSWLEG 
ENLDPIYKEW PNIGGSWLGG 
ENLDPIYKEW PSIGGSWLGG 
ETLDPIYKEW PNIGGFWLEG 

INFSPVLGCL G 

INFSPVLGCL G 

INFSPLLGCL G 

INFSPLLGCL G 

INFSPVLGCL G 

FNISLLLTN 

FNFSQILPDP 



1025 

RNIHSS 

LPTSGS 

V — YDPASGR 
LKYILPSHNS 
LKDILPSHNS 
LKDILPSHNS 
LKYILPSDNS 
-SECSKASS- 
-SACNKVSS- 
-SDCGEVTMA 
-SDCSEGTKA 
-SECNRAST- 



1035 
RIAGRSALED 
RVAGRSAIED 
WQKRSVIED 
KRKYRSAIED 
KRKYRSAIED 
KRKYGSAIED 
KRKYRSAIED 

RSAIED 

RSAIED 

AQTGRSAIED 
AQ-GRSAIED 
RSAIED 



1045 
LLFSKVVTSG 
ILFSKLVTSG 
LLFNKVVTNG 
LLFDKVVTSG 
LLFDKVVTSG 
LLFDKVVTSG 
LLFSKVVTSG 
LLFDKVKLSD 
LLFSKVKLSD 
VLFDKVKLSD 
VLFDKVKLSD 
LLFDKVKLSD 



1055 
LGTVDVDYKS 
LGTVDADYKK 
LGTVDEDYKR 
LGTVDEDYKR 
LGTVDEDYKR 
LGTVDEDYKR 
LGTVDEDYKR 
VG-FVEAYNN 
VG-FVEAYNN 
VG-FVEAYNN 
VG-FVESYNN 
VG-FVQAYNN

1065
CTKGLS — 
CTKGLS— 
CSNGRS— 
CTGGYD — 
SAGGYD-* 
CTGGYD- 
CTGGYD- 
CTGGAE- 
CTGGAE- 
CTGGQE- 
CTGGQE- 
CTGGAE- 



1075 
IA DLACAQYYNG 
IA DLACAQYYNG 
VA DLVCAQYYSG 
•IA DLVCAQYYNG 
•IA DLVCARYYNG 
•IA DLVCAQYYNG 
-IA DLVCAQYYNG 
-IR DLICVQSYKG 
-IR DLICVQSYNG 
-VR DLLCVQSFNG 
-VR DLLCVQSFNG 
-IR DLICVQSYNG

TAG DRG IAPKSGYFVN VN NTWMYTGSGY YYPEPITENN VWMSTCAVN

tCo DRG IAPKSGYFVN VN NTWMFTGSGY YYPEPITGNN VWMSTCAVN 

TSG DRG LAPKAGYFVQ DD GEWKFTGSNY YYPEPITDKN SVVMSSCAAN 

III DRG LAPKAGYFVQ DH GEWKFTGSNY YYPESITDKN SWMSSCAVN 

2fi DIG ISPKSGYFIN VN NSWMFTGSSY YYPEPITQNN VWMSTCAVN 

VKPANASQYA IVPANGRGIF IQVN GSYYITARDM YMPRAITAGD WTLTSCQAN 

HEG----KA YFPREGVFVF NG TSWFITQRNF FSPQIITTDN T FVS GNCDW

1385
FVNISRVELH 
FVNISRSELQ 
YVNLTSDQLP 
FVNATVSDLP 
FVNGTVIELP 
FVNATVIDLP 
FVNTTVSDLP 
YTKAPYVMLN 
YTKAPDVMLN 
YTKAPEVFLN 
YTKAPEVFLN 
YTKAPDLMLN 
YVSVNKTVIT 
IGIINNTVYD

1395
TVIP-DYVDV 
TIVP-EYIDV 
DVIP-DYIDV 
SIIP-DYIDI 
SIIP-DYIDI 
SIIP-DYIDI 
SIIP-DYIDI 
TSIP-NLPDF 
ISTP-NLHDF 
TSIP-NLPDF 
TSIT-NLPDF 
TSTP-NLPDF 
TFVDNDDFDF 
PLQP-ELDSF 



1405 
NKTLQEFAQN 
NKTLQELSYK 
NKTLDEILAS 
NQTVQDILEN 
NQTVQDILEN 
NQTVQDILEN 
NQTVQDILEN 
KEELDQWFKN 
KEELDQWFKN 
KEELDKWFKN 
KEELDKWFKN 
KEELYQWFKN 
NDELSKWWND 
KEELDKYFKN 



....|....|

1415 
L-PKYVKPNF 
L-PNYTVPDL 
L-PNRTGPSL 
FRPNWTVPEL 
FRPNWTVPEL 
YRPNWTVPEF 
FRPNWTVPEL 
QTSVAPDLSL 
QTSVAPDLSL 
QTSIAPDLSL 
QTSIVPDLSF 
QSSVAPDLSL 
T — KHELPDF 
HTSPDVDLGD 



....|....|

1425 
DLTPFNLTYL 
WEQYNQTIL 
PLDVFNATYL 
TFDIFNATYL 
PLDIFHATYL 
TLDI FNATYL 
TL D V FN AT YL 
DY — INVTFL 
DY — INVTFL 
DFEKLNVTLL 
DIGKLNVTFL 
DY — INVTFL 
DKFNYTVPIL 
ISG-INASW 



....|....|

1435 
NLSSELKQLE 
NLTSEISTLE 
NLTGEIADLE 
NLTGEIDDLE 
NLTGEINDLE 
NLTGEIDDLE 
NLTGEIDDLE 

DLQVEMN 

DLQDEMN 

DLTDEMN 

DLSYEMN 

DLQDEMN 

DIDSEID 

NIQKEID 



....|....|

1445 
AKTASLFQTT 
NKSAELNYTV 
QRSESLRNTT 
FRSEKLHNTT 
FRSEKLHNTT 
FRSEKLHNTT 
FRSEKLHNTT 



....|....|

1455 
VELQGLIDQI 
QKLQTLIDNI 
EELRSLINNI 
VELAILIDNI 
VELAILIDNI 
VELAILIDNI 
VELAILIDNI 
-RLQEAIKVL 
-RLQEAIKVL 
-RIQDAIKKL 
-RIQDAIKNL 
-RLQEAIKVL 
-RIQGVIQGL 
-RLNEVAKNL 

I 



■ I 



1465 
NSTYVDLKLL 
NSTLVDLKWL 
NNT LVDLEWL 
NNTLVNLEWL 
NNTLVNLEWL 
NNTLVNLEWL 
NNTWNLEWL 
NQSYINLKDI 
NQSYINLKDI 
NESYINLKDV 
NESYINLKEI 
NQSYINLKDI 
NDSLIDLEKL 
NESLIDLQEL 



....|....|

1475 
NRFENYIKWP 
NRVETYIKWP 
NRVETYIKWP 
NRIETYVKWP 
NRIETYVKWP 
NRIETYVKWP 
NRIETYVKWP 
GTYEYYVKWP 
GTYEYYVKWP 
GTYEMYVKWP 
GTYEMYVKWP 
GTYEYYVKWP 
SILKTYIKWP 
GKYEQYIKWP 



....|....|

1485 
WWVWLIISW 
WWVWLCISW 
WWVWLIIVIV 
WYVWLLIGLV 
WYVWLLIGLV 
WYVWLLIGLV 
WYVWLLIGLV 
WYVWLLICLA 
WYVWLL I GFA 
WYVWLLIGLA 
WYVWLLIGLA 
WYVWLLIGLA 
WYVWLAIAFA 
WYVWLGFIAG 



....|....|

1495 
FVVLLSLLVF 
LIFVVSMLLL 
LIFVVSLLVF 
VIFCIPLLLF 
VIFCIPILLF 
WFCIPLLLF 
VIFCIPLLLF 
GVAMLVLLFF 
GVAMLVLLFF 
GVAVCVLLFF 
GVAVCVLLFF 
GVAMLVLLFF 
TIIFILILGW 
LIAIVMVTIL 



1505 
CCLSTGCCGC 
CCCSTGCCGF 
CCISTGCCGC 
CCCSTGCCGC 
CCCSTGCCGC 
CCFSTGCCGC 
CCCSTGCCGC 
ICCCTG-CG- 
ICCCTG-CG- 
ICCCTG-CG- 
ICCCTG-CG- 
ICCCTG-CG- 
VFFMTGCCGC 
LCCMTSCCS- 



1515 
CNCLTSSMRG 
FSCFASSIRG 
CGCCGACFSG 
IGCLGSCCHS 
IGCLGSCCHS 
IGCLGSCCHS 
IGCLGSCCHS 
— TSCFKKCGG 
— TSCFKICGG 
-SCCFKKCGN 
-SCCFKKCGN 
—TSCFKKCGG 
CCGCFGIMPL 
-CLKGACSCG 



....|....|
1525
CCDCGSTKLP 
CCES — TKLP 
CCRG-PRLQP 
ICSR-RQFEN 
ICSR-GQFES 
ICSR-RQFEN 
IFSR-RQFEN 
CCDDYTGYQE 
CCDDYTGHQE 
CCDECGGHQD 
CCDEYGGRQA 
CCDDYTGHQE 
MSKCGKKSSY 
SCCKFDEDDS 



....|....| ....|....|

1535 1545 

YYEFEKVHVQ — . 

YYDVEKIHIQ 

YEAFEKVHVQ 

YEPIEKVHVH 

YEPIEKVHVH 

YEPIEKVHVH 

YEPIEKVHVH 

LVIKT SH DD 

LVIKT SH DD 

SIVIHNISSH ED 

GIVIHNISSH ED 

FVIKT SH DD 

YTTFDNDVVT EQYRPKKSV 
EPVLKGVKLH YT 



f. Putative Orf 4a



....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

5 15 25 35 45 55 

EMCR 4a MPFGGLFQLT LESTINKSVA NLKLPPHDVT VLRDNLKPVT TLSTITAYLL VSLFVTYFAL 

229E 4a MALG-LFTLQ LVSAVNQSLS NAKVSAEVSR QVIQDVKDGT VTFNLLAYTL MSLFVVYFAL 

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

65 75 85 95 105 115 

EMCR 4a FKPLTARGRV ACFVLKLLTL SVYVPLLVLF GMYLDSFIIF FLRCCFDSYM LAIMPISNKN 

229E 4a FKARSHRGRA ALIVFKILIL FVYVPLLYWS QAYIYATLIA VILLG-RFFH TAWHCWLYKT 

....|....| ....|....| ....|....| ....|....| ....|....| ....|....|

125 135 145 155 165 175
EMCR 4a FSFVLFNVTK LCFVSGKCWY LEQSFYENRF AAIYGGDHYV VLGGETITFV SFDDLYVAIR 
229E 4a WDFIVFNVTT LCYAR 



185 195 205 215 225

EMCR 4a GSCEKNLQLM RKVDLYNGAV IYIFAEEPVV GIVYSSQLYE DVPSIN



g. Putative Orf 4ab 



EMCR 4a 
229E 4a 
229E 4b 



I 



15 



I 



I 



MPFGGLFQLT LESTINKSVA NLKLPPHDVT VLRDNLKPVT TLSTITAYTT vet™™™. 

^VNQSLS NAKVSAEVSR QVIQD^K Km XSSEEE 



EMCR 4a 
229E 4a 
229E 4b 



....|....| ....|....| ..... , . 

• 7 5 85 95 i ac * 

FKPLTARGRV ACFVLKLLTL SVYVPLLVLF GMYLDSFIIF FLRCCFDSYM T atmotcuvm 

™^ ™f™ v"S 



EMCR 4a 
229E 4a 
229E 4b 



EMCR 4a 
229E 4a 
229E 4b 



125 



I 



135 



I 



145 



I ....|. ...| ....|....| 

WDF™T LEQiEW A^YGGDHYV VLGGETITFV SFDD^VAIR 

MQGKCW FLENKALKPF VCFYGGDQFL YIGDRIVSYF STNDLYVALR 

....|....| ....|....| ....|....| ....|....| ....|....|
185 195 205 215 225

GSCEKNLQLM RKVDLYNGAV IYIFAEEPVV GIVYSSQLYE DVPSIN 

GRIDKDLSLS RKVELYNGEC VYLFCEHPAV GIVNTDFKLE IhTTTT 



h. Putative Orf E 



EMCR E 

229E 

PEDV 

TGEV 

CaCoV 

FeCov 

Por Resp C 

OC43 

BoCoV 

PHEV 

MHV 

Rat CoV 

AIBV 

SARS 



EMCR E 

229E 

PEDV 

TGEV 

CaCoV 

FeCoV 

Por Resp C 

OC43 

BoCoV 

PHEV 

MHV 

Rat CoV 

AIBV 

SARS 



MFLRLI 

MFLKI#V 

MLQLV 

MTFPRALTVI 
MTFPRALTVI 
MTFPRAFTII 
MTFPRALTVI 
— MFMADAYL 
— MFMADAYF 
— MFMADAYL 

MFNLFL 

MFNLFL 

— MNLLNKSL 
MYSFVS 



....|....|

15 

DDNG-IVLNS 
DDHA-LVVNV 
NDNG-LVVNV 
DDNG-MVINI 
DDNG-MVISI 
DDHG-MWSV 
DDNG-MVISI 
ADTV-WYVGQ 
ADTV-WYVGQ 
ADTV-WYVGQ 
TDTV-WYVGQ 
IDTV-WYVGQ 
EENG-SFLTA 
EETGTLIVNS 



....|....|

25 

ILWLLVMIFF 
LLWCWLIVI 
ILWLFVLFFL 
IFWFLLIIIL 
IFWFLLIIIL 
FFWLLLIIIL 
IFWFLLIIIL 
IIFIVAICLL 
IIFIVAICLL 
IIFIVAICLL 
IIFIVAVCLM 
IIFIVAVCLM 
LYIIVGFLAL 
VLLFLAFVVF 



....|....|
35 

F-VLAMTFIK 
L-LVCITIIK 
L-IISITFVQ 
I-LLSIALLN 
I-LFSIALLN 
I-LFSIALLN 
I-LLSIALLN 
VTIVWAFLA 
VIIVVVAFLA 
VIIVVVAFLA 
VTIIVVAFLA 
VTIIVVAFLA 
Y-LLGRALQA 
L-LVTLAILT 



....|....|
45

LIQLCFTCHY 
LIKLCFTCHM 
LVNLCFTCHR 
IIKLCMVCCN 
IIKLCMVCCN 
VIKLCMVCCN 
IIKLCMVCCN 
TFKLCIQLCG 
TFKLCIQLCG 
TFKLCIQLCG 
SIKLCIQLCG 
SIKLCIQLCG 
FVQAADACCL 
ALRLCAYCCN 



....|....|
55 

FFSRTLYQP- 
FCNRTVYGP- 
LCNSAVYTP- 
LGRTVIIVP- 
LGRTVIIVP- 
LGKTIIVLP- 
LGRTVIIVP- 
MCNTLVLSP- 
MCNTLVLSP- 
MCNTLVLSP- 
LCNTLLLSP- 
LCNTLLLSP- 
FWYTWWIPG 
IVNVSLVKP- 



....|....| 
65 

VYKIFL- 

IKNVYH- 

IGRLYR- 

AQHAYD- 

ARHAYD- 

ARHAYD- 

VQHAYD- 

. -r~SXYVENR 

SIYVFNR 

SIYVFNR 

— SICVYNR 

SIYVYNR 

AKGTAFVYKY 
TVYVYS- 



75 



I 



GR 

GR 

SK 



....|....|
85 

AYQDYM 

IYQSYM 

VYKSYM 

AYKNFM 

AYKNFM 

AYKTFM 

AYKNFM 

QFYEFYN 

QFYEFYN 

QFYEFYN 

QLYKYYN 



SK QLYKYYN 

TYGRKLNNPE LEAVIVNEFP 
RVKNLN 



....|....| 
95 

— QIAPV-PA 
— HIDPF-PK 
— RIDPL-PS 
— RIKAYNPD 
— QIRAYNPD 
— QTKAYNPD 
— RIKAYNPD 
DVKPP-VL 
— DVKPP-VL 
— DVKPP-VL 
E-EVRPP-PL 
E-EVRPP-PL 
KNGWNNKN PA 
— SSEGV-PD 



....|....|
105 

EVLNV 

RVIDF 

TVIDV 

GAL LA 

EALLV 

EAFLV 

GALLV — — 

DVDDV-- — - " 

DVDDV 

DVDDV 

EVDDIIIQTL — 
EVDDIIIQTL — 
NFQDAQRDKL YS 
LLV 



i. Putative Orf M (Matrix protein) 



15 SARS 



OC43 
PHEV 
BoCoV 
MHV 



, 15 25 35 45 55 
5 15 1! M SNSS 



EMCR M SNDN- 

229E ~ M SNGS 

PEDV _ Z MK ILLILACVIA CACGERYCAM KSDTDLSCRN 

5 TGEV - MKK ILFLLACAIA CVYGERYCAM TESS-TSCRN 

i a ?.°l ^MMPIRPLC KPRHIIPTKH FWFELNKMKY ILLILACIIA CVYGERYCAM QDSG-LQCIN 

F6COV MHMMP IRPLC KPRHliriRn r MR ILLILRCAIA CTCGERYCAM KDDTGLSCRN 

PRCOV ~- M SSKT 

OC43 M SSPT 

10 PHEV "~_ M SSVT 

BoCoV ~~~ M TSTTQ 

MHV ~~ ~_S- M SSTTP 

RatSAV "~ M PNETN 

AIBV ~~ M ADNG 



■ , | ...| ....|....| ....I. ...I • I I ...-I- ...I 

a" 75 85 95 105 115 



PM - n . V PLSEVYVHLR NWNFSWNLIL TVFIWLQYG HYKYSRLLYG LKMSVLWCLW 

?n 229E " C — TGDIVTHLK NWNFGWNVIL TIFIVILQFG HYKYSRLFYG ^VLWLLW 

^ U "9B PVDEVIEHLR NWNFTWNIIL TILLWIiQYG HYKYSVFLYG VKMAILWILW 

I™ STASDCESCF NGGDLIWHLA NWNFSWSIIL IVFITVLQYG RPQFSWFVYG IKMLIMWLLW 

r»rL STAGNCASCF ETGDLIWHLA NWNFSWSVIL IIFITVLQYG RPQFSWFVCG IKMLIMWLLW 

Itrll CTNSRCOTCF KlIWHLA NWNFSWSVIL IVFITVLQYG RPQFSWLVYG IKMLIMWLLW 

OK rTalnCESCF NRGDLIWLLA NWNFSWSIIL IIFITVLQYG RPQFSWFVYG IKMLIMWLLW 

25 "Sf -!t!apvyiw S S ewnfslgiil lfitiilqfg ytsrsmfvyv ikmiilwlmw 

° C43 — TPVPVISW TADEAIKFLK EWNFSLGIIV LFITIILQFG YTSRSMFVYV IKMVILWLMW 
--TPAPVYTW TADEAIKFLK EWNFSLGIIL LFITIILQFG YTSRSMFVYV IKMIILWLMW 
-APQPVYQW TADEAIRFLK EWNFSLGIIL LFVTIILQFG YTSRSMFVYV VKMILLWLMW 

OA Smav --APQTVYQW TADVAVRFLK EWNFLLGIIL LFITIILQFG YTSRSMFIYV VKMIILWLMW 

30 RatSAV DFEQSVQLFK EYNLFITAFL LFLTIILQYG YATRSKVIYT LKMIVLWCFW 

**** _il TVEELKQLLE QWNLVIGFLF LAWIMLLQFA YSNRNRFLYI IKLVFLWLLW 

....(....l 1 

oc. 125 135 145 155 165 lv5 

•33 • pLVLALSIFD CFVNFNVD-W VFFGFSILMS IITLCLWVMY FVNSFRLWRR VKTFWAFNPE 
PTTOJiLSIFD TWANWDSH-W AFVAFSFFMA VSTLVMWVMY FANSFRLFRR ARTFWAWNPE 
vffKfsilma CITLMLWIMY FVNSIRLWRR TKSWWSFNPE 

™ pSot ayseyqvsry vmfgfsiaga ivtfvlwimy fvrsiqlyrr TKSWWSFNPE 

, n AYLEYRVSRY VMFGFSVAGA TVTFILWIMY FVRSIQLYRR TKSWWSFNPE 

40 ?S Z ■ P^WrFN AYSEYQVSRY VMFGFSVAGA WTFALWMMY FVRSVQLYRR TKSWWSFNPE 

PRCOV PIVLALTIFN S VMFGFSIAGA IVTFVLWIMY FVRSIQLYRR TKSWWSFNPE 
25? • PLTIILTIFN — CVYALN-N VYLGLSIVFT IVAIIMWIVY FVNSIRLFIR TGSFWSFNPE 

pHEV PLTIILTIFN — CVYALN-N VYLGFSIVFT IVAIIMWWY FVNSIRLFIR TGSWWSFNPE 

A C PLTIILTIFN —CVYALN-N VYLGFSIVFT IVAIIMWIVY FVNSIRLFIR TGSWWSFNPE 

45 IZl SSS — CVYALN-N VYLGFSIVFT IVSIIMWIMY FVNSIRLFIR TGSWWSFNPE 

RatSAV PLTIVLCIFN -CVYALN-N VYLGFSIVFT IVSIVMWIMY FVNSIRLFIR TGSWWSFNPE 

^BV PLNIAVGVIS — CTYPPN-T GGLVAAIILT VFACLSFVGY WIQSIRLFKR CRSWWSFNPE 

saro pvSfvla -AVYRIN-W VTGGIAIAMA CIVGLMWLSY FVASFRLFAR TRSMWSFNPE 

185 195 205 215 225 235 

EMCR TNAIISLQVY -GHNYYLPVM AAPTGVTLTL LSGVLLVDGR KIATRVQVGQ LPKYVIVATP 

S vnaitvttVl -GQTYYQPIQ QAPTGITVTL LSGVLYVDGH RLASGVQVHN LPEYMTVAVP 

c c p"v TDALLTTSVM -GRQVCIPVL GAPTGVTLTL LSGTLLVEGY KVATGVQVSQ LPNFVTVAKA 

55 ?KAILCVSAL -GRSYVLPLE GVPTGVTLTL LSGNLYAEGF KIAGGMNIDN LPKYVMVALP 

r*CoV TSAILCVSAL -GRSYVLPLE GVPTGVTLTL LSGNLCAEGF KIAGGMNIDN LPKYVMVALP 

FeCoV TNAILCVNAL -GRSYVLPLD GTPTGVTLTL LSGNLYAEGF KMAGGLTIEH LPKYVMIATP 

PRCOV TOAILCVSAL -IrSYVLPLE GVPTGVTLTL LSGNLYAEGF KIAGGMTIDN LPKYVMVALP 

60 OC43 TNNLMCI DMK -GTMYVRPII EDYHTLTVTI IRGHLYIQGI KLGTGYSLAD LPAYMTVAK- 

PHEV TNNLMCIDMK -GRMYVRPII EDYHTLTATI IRGHLYIQGI KLGTGYSLSD LPAYVTVAK- 

BoCoV TNNLMCIDMK -GRMYVRPII EDYHTLTVTI IRGHLYMQGI KLGTGYSLSD LPAYVTVAK- 

MHV TNNLMCIDMK -GTVYVRPII EDYHTLTATI IRGHLYMQGV KLGTGFSLSD LPAYVTVAK- 

RatSAV TNNLMCIDVK -GTVYVRPII EDYHTLTATN VRGHLYMQGV KLGTGFSLSD LPAYVTVAK- 

65 AIBV SNAVGSILLT NGQQCNFAIE SVPMVLSPII KNGVLYCEGQ WLAK-CEPDH LPKDIFVCTP 

bb | R rs TNILLNVPLR — GTIVTRPLM ESELVIGAVI IRGHLRMAGH PLGR-CDIKD LPKEITVAT- 

245 255 265 275 285 

EMCR STTIVCDRVG RSVNETSQTG WAFYVRAKHG DFSGVASQEG VLSEREKLLH LI 

229E STTIIYSRVG RSVNSQNSTG WVFYVRVKHG DFSAVSSPMS NMTENERLLH FF 

PEDV TTTIVYGRVG RSVNASSGTG WAFYVRSKHG DYSAVSNPSA VLTDSEKVLH LV 

TGEV SRTIVYTLVG KKLKASSATG WAYYVKSKAG DYSTEAR-TD NLSEQEKLLH MV 

CaCoV VRTIVYTLVG KKLKASSATG WAYYVKSKAG DYSTDAR-TD NLSEHEKLLH MV 

FeCoV SRTIVYTLVG KQLKATTATG WAYYVKSKAG DYSTEAR-TD NLSEHEKLLH MV 

PRCOV SRTIVYTLVG KKLKASSATG WAYYVKSKAG DYSTEAR-TD NLSEQEKLLH MV 

OC43 VTHLCTYKRG FLDRISDTSG FAVYVKSKVG NYRLPSTQKG SGMDTALLRN NI 

PHEV VTHLCTYKRG FLDRIGDTSG FAVYVKSKVG NYRLPSTHKG SGMDTALLRN NI 

BOCOV VSHLLTYKRG FLDKIGDTSG FAVYVKSKVG NYRLPSTQKG SGMDTALLRN NI 

MHV VSHLCTYKRA FLDKVDGVSG FAVYVKSKVG NYRLPSN-KP SGMDTALLR- -I 



RatSAV 

AIBV 

SARS 



AC/99 



j. Putative Orf N (Nucleoprotein) 



Row labels (top to bottom): EMCR, 229E, PEDV, TGEV, FeCoV, PRCoV, CaCoV, RSDACoV, MHV, PHEV, OC43, BoCoV, SARS, AIBV 



5 15 25 

— MAS VN W 

MAT VK W 

— MAS VS F 

— MANQGQR VS W 

— MATQGQR- VN W 

MANQGQR VS W 

MASQGQR VS W 

MSFVPGQENA GSRSSSGNRA GNGILKKTTW 
MSFVPGQENA GSRSSSGNRA GNGILKKTTW 

MSFTPGKQSS -SRASSGNRS GNGILK W 

MSFTPGKQSS -SRASSGNRS GNGILK W 

MSFTPGKQSS — SRASFGNRS GNGILK W 

MSDNGPQS NQRSAPRITF 

MASG K A 



35 

ADDR 

ADASEPQ 

QDRG 

GDESTKT 

GDEPSKR 

GDESTKI 

GDESTKR 

ADQTERGQNN 

ADQTERG 

ADQSDQSRNV 
ADQSDQFRNV 
ADQSDQSRNV 
GGPTDSTDNN 
AGKTDAPAPV 



45 

-AARKKF- 
-RGRQGR- 



55 



— RGRSNSRG 
— RGRSNSRG 
— RGRSNSRG 
— RGRSNSRG 
GNRGRRNQPK 
-NRGRRNHPK 
QTRGRRVQSK 
QTRGRRAQPK 
QTRGRRAQPK 
QNGGRNGARP 
IKLGGPKPPK 



RKNNN 

RKNND 

RKINN 

RKNND 

QTATTQ-PNT 
QTATTQ-PNA 
QTATSQQPSG 
QTATSQQPSG 
QTATSQLPSG 

KQRRPQ 

VGSS 



65 

PPPSFY 

IPYSLY 

VPLSLY 

IPLSFF 

IPLSFY 

IPLSFF 

IPLSFF 

GSWPHYSWF 
GSWPHYSWF 
GTVVPYYSWF 
GNWPYYSWF 
GNVVPYYSWF 
GLPNNTASWF 
GNASWF 



75 

MPLLVSSDKA 
SPLLVDSE-Q 
APLRVTNDKP 
NPITLQQGSK 
NPITLEQGSK 
NPITLQQGAK 
NPITLEQGSK 
SGITQFQKGK 
SGITQFQKGK 
SGITQFQKGK 
SGITQFQKGK 
SGITQFQKGK 
TALTQHGK-E 
QAIKAKKLNT 



85 

PYRVIPRNLV 
PWKVIPRNLV 
LSKVLANNAV 
FWNLCPRDFV 
FWNLCPRDLV 
FWNSCPRDFV 
FWDLCPRDFV 
EFQFAGGQGV 
EFQFAQGQGV 
EFEFAEGQGV 
EFEFVEGQGV 
EFEFAEGQGV 
ELRFPRGQGV 
PPPKFEGSGV 



95 

PIGKGNK-DE 
PINKKDK-NK 
PTNKGNK-DQ 
PKGIGNR-DQ 
PKGIGNK-DQ 
PKGIGNR-DQ 
PKGIGNK-DQ 
PIANGIPPSE 
PIASGIPASE 
PIAPGVPSTE 
PIAPGVPATE 
PIAPGVPATE 
PINTNSGPDD 
PDNENIKPSQ 



105 
QIGYWNVQER 
LIGYWNVQKR 
QIGYWNEQIR 
QIGYWNRQTR 
QIGYWNRQIR 
QIGYWNRQTR 
QIGYWNRQTR 
QKGYWYRHNR 
QKGYWYRHNR 
AKGYWYRHNR 
AKGYWYRHNR 
AKGYWYRHNR 
QIGYYRRATR 
QHGYWRRQAR 



115 
— WRMRRGQR 
— FRTRKGKR 
— WRMRRGER 
— YRMVKGQR 
— YRIVKGQR 
— YRMVKGQR 
— YRMVKGRR 
RSFKTPDGQQ 
RSFKTPDGQH 
RSFKTADGNQ 
RSFKTADGNQ 
RSFKTADGNQ 
R-VRGGDGKM 
— FKPGKGGR 



125 
VDLPPKVHFY 
VDLSPKLHFY 
IEQPSNWHFY 
KELPERWFFY 
KELAERWFFY 
KELPERWFFY 
KNLPEKWFFY 
KQLLPRWYFY 
KQLLPRWYFY 
RQLLPRWYFY 
RQLLPRWYFY 
RQLLPRWYFY 
KELSPRWYFY 
KPVPDAWYFY 



135 
YLGTGPHKDL 
YLGTGPHKDA 
YLGTGPHGDL 
YLGTGPHADA 
FLGTGPHADA 
YLGTGPHADA 
YLGTGPHADA 
YLGTGPHAGA 
YLGTGPHAGA 
YLGTGPHAKD 
YLGTGPHAKD 
YLGTGPHAKD 
YLGTGPEASL 
YTGTGPAADL 
145 
KFRQRSDGVV 
KFRERVEGW 
RYRTRTEGVF 
KFKDKLDGVV 
KFKDKIDGVF 
KFKDKLDGVV 
KFKQKLDGVV 
SFGDSIEGVF 
EYGDDIEGW 
QYGTDIDGVF 
QYGTDIDGVY 
QYGTDIDGVF 
PYGANKEGIV 
NWGDTQDGIV 



155 
WVAKEGAKTV 
WVAVDGAKTE 
WVAKEGAKTE 
WVAKDGAMNK 